entropy design methodology:

1.  Whereas  the  dimension  of  an  LQG  controller  must  equal  that  of  the 
controlled  plant,  optimal  projection  design  characterizes  the 
quadratically  optimal  controller  of  fixed  dimension  less  than 
that  of  the  plant  in  accordance  with  implementation  constraints 
(e.g., reliability, complexity or real-time computing capability).

2.  Whereas  LQG  presumes  exact  knowledge  of  each  and  every  parameter 
appearing  in  the  state-space  plant  description,  maximum  entropy 
modelling  provides  a  stochastic  plant  model  which  admits  ignorance 
with  regard  to  parameter  values  in  accordance  with  unavoidable 
plant  modelling  errors. 

The  principal  aim  of  this  research  is  to  investigate  the  impact  of 
flexible  spacecraft  structural  modelling  uncertainties  and  dimensionality 
on  active  control  design  through  the  exploitation  of  the  maximum  entropy 
stochastic  modelling  approach  together  with  the  optimal  projection 
formulation  of  quadratically  optimal,  fixed-order  dynamic  compensation. 
Within  this  context,  the  main  goal  is  to  implement  and  demonstrate  a 
unified  design  synthesis  technique  which  permits  direct  design  in  the  face 
of incomplete system information while preserving a well-defined notion
of optimality.

INTRODUCTION


Increased  interest  in  deploying  large  flexible  spacecraft  has  focussed 
attention  on  active  structural  control  techniques  to  achieve  crucial  advances  in 
vibration  suppression,  pointing  accuracy  and  shape  control.  The  extreme 
complexity  of  such  systems  and  the  lack  of  accurate  finite-element  structural 
models  present  severe  control-design  challenges  which  were  extensively  documented 
by  DARPA's  ACOSS  Program.  Optimal  Projection/Maximum  Entropy  Stochastic  Modelling 
and  Reduced-Order  Design  Synthesis  is  a  rigorous  new  approach  to  this  class  of 
problems conceived by Dr. D. C. Hyland in [2-16]. Inspired by Statistical Energy
Analysis ([1]), a branch of dynamic modal analysis developed for analyzing
acoustic vibration problems, its present stage of development ([2-30], see
Appendices A-P) embodies a mathematically rigorous, fundamental generalization of
classical  steady-state  Kalman  filter  and  linear-quadratic-Gaussian  (LQG)  optimal 
control  theory. 

Although  LQG  theory  is  an  effective  tool  for  optimally  quantifying 
performance/sensor-resolution  and  performance/actuation-level  tradeoffs,  it 
suffers  from  two  fundamental  defects  which  severely  limit  its  usefulness  in 
practice.  These  defects  are  remedied  by  the  optimal  projection/maximum  entropy 
design  methodology: 

1.  Whereas  the  dimension  of  an  LQG  controller  must  equal  that  of  the 
controlled  plant,  optimal  projection  design  characterizes  the 
quadratically  optimal  controller  of  fixed  dimension  less  than  that 
of  the  plant  in  accordance  with  implementation  constraints  (e.g., 
reliability,  complexity  or  real-time  computing  capability). 

2.  Whereas  LQG  presumes  exact  knowledge  of  each  and  every  parameter 
appearing  in  the  state-space  plant  description,  maximum  entropy 
modelling  provides  a  stochastic  plant  model  which  admits  ignorance 
with  regard  to  parameter  values  in  accordance  with  unavoidable 
plant  modelling  errors. 

With regard to the latter item, it should be stressed that one of the
major  problems  in  designing  high-performance  control  systems  is  that  of 
robustness,  i.e.,  the  ability  of  the  controller  to  tolerate  errors  in  the  plant 
model  upon  which  its  design  is  predicated.  Maximum  entropy  modelling  directly 
addresses  this  problem  by  incorporating  into  the  dynamic  model  a  representation  of 
ignorance  (i.e.,  uncertainty)  regarding  physical  parameters.  Roughly  speaking, 
the  idea  behind  the  approach  is  to  use  a  probabilistic  representation  of  each 
imperfectly  known  plant  parameter.  The  quadratically  optimal  control  system 
designed  in  the  presence  of  this  probabilistic  model  is  automatically  desensitized 
to  actual  parameter  variations  when  the  control  system  is  implemented.  The 
overall control-design procedure thus avoids laborious trial-and-error post-design
"tweaking."


The  validity  of  the  above  claims  has  been  demonstrated  on  a  series  of 
representative  structural  models  of  increasing  complexity.  Beginning  with  an 
illustrative  simply  supported  beam  for  conceptual  validation  ([5]),  the  theory  has 
since  been  applied  to  both  a  20-state  version  of  the  Draper  Model  #2  ([19], 
Appendix  A)  and  a  16-state  model  of  NASA's  SCOLE  experiment. 

Although  the  early  development  of  the  optimal  projection/maximum 
entropy  approach  was  documented  in  numerous  conference  proceedings,  a  series  of 
manuscripts  has  recently  entered  the  realm  of  archival  publications.  Reference 
[23]  (Appendix  B)  recently  appeared  in  the  November  issue  of  the  IEEE  Transactions 
on  Automatic  Control  and  reference  [29]  (Appendix  C)  has  been  accepted  as  a  full 
paper  by  this  journal.  Reference  [28]  (Appendix  D),  which  rigorously  extends  the 
results  of  [23]  to  distributed  parameter  systems,  is  scheduled  to  appear  in  SIAM 
Journal  on  Control  and  Optimization.  In  addition,  a  series  of  manuscripts 
([31]-[34], see also Appendix E) is under preparation detailing the results of the
initial  12-month  contract  period.  These  papers,  which  will  be  submitted  to  both 
conferences  and  journal  publications,  will  considerably  broaden  the  theoretical 
scope  of  the  OP/ME  approach  and  further  demonstrate  its  practical  applicability. 

1.1 Study Overview

The  principal  aim  of  this  effort  is  to  investigate  the  impact  of 
flexible  spacecraft  structural  modelling  uncertainties  and  dimensionality  on 
active  control  design  through  the  exploitation  of  the  minimal  data/maximum  entropy 
stochastic modelling approach together with the optimal projection formulation of
quadratically optimal, fixed-order dynamic compensation. Within this context, the
main goal is to implement and demonstrate a unified design synthesis technique
which  permits  direct  design  in  the  face  of  incomplete  system  information  while 
preserving  a  well-defined  notion  of  optimality.  The  tasks  required  to  accomplish 
the  goals  of  this  contract  are  discussed  in  detail  within  the  original  technical 
proposal  and  are  briefly  summarized  as  follows: 

Task  1:  Implement  fully  mechanized  construction  of  the  uncertainty 
operators  employed  by  the  maximum  entropy  modelling  approach.  These 
uncertainty  operators  would  be  based  on  very  general  descriptions  of 
structural  parameter  uncertainties  as  represented  in  a  variety  of 
vector  bases.  The  purpose  of  this  task  is  to  facilitate  interaction 
between  control-system  designers  and  structural  analysts  by 
streamlining  the  stochastic  modelling  process  and  extending  maximum 
entropy  modelling  theory  and  its  mechanizations  to  the  treatment  of 
uncertainty  in  general  classes  of  physical  structural  parameters. 

Task 2: Develop and test techniques for numerically solving the
fixed-order dynamic-compensation optimality conditions given in [7].
These computational techniques would be implemented in a
"user-friendly" form to facilitate the design process and would be
capable of handling design models of high order (10 states). This
task is planned in two stages:

2.A. Develop and test techniques for solving the optimal
projection equations for deterministically parametered system
models and/or maximum entropy stochastic models in the
coherent regime. Such techniques, by employing suitable
variants of standard LQG software modules, should, therefore,
be capable of handling problems of modest dimension (<20
modes).

2.B. Develop advanced, relaxation-type solution techniques that
exploit the crucial incoherence and isotropy effects of
maximum entropy models. These techniques would incorporate
the developments in 2.A while permitting treatment of very
large dimensional systems.

Task 3: Demonstrate the design capabilities developed in Tasks 1 and 2
on  the  EPAR  and  Draper  Model  #2  spacecraft  models  as  well  as  on  simple 
high-order  examples  (beams,  plates,  and  the  like).  As  argued  in  [8], 
the  mechanizations  to  be  developed  under  Task  2  above  would  constitute 
basic  computational  tools  possessing  a  variety  of  unique  features  and 
could  play  a  pivotal  role  in  both  structural  and  control-system  design. 

Our goal within the first year of this study* was to complete Task 2.A
and initiate Task 3. This and more have been accomplished. In particular, Task 1
was addressed, and new theoretical and practical results were obtained for the
basic maximum entropy approach. Task 2.A is largely complete, and quadratically
optimal,  low-order  control  designs  were  demonstrated  for  a  version  of  the  CSDL 
Model  #2.  In  addition,  the  combined  optimal  projection/maximum  entropy  design 
method  has  been  developed,  implemented,  and  tested  on  a  spacecraft  example  problem 
of  intense  current  interest. 

The  above  developments  are  outlined  in  the  following  sections  of  this 
report.  Since  many  of  these  investigations  have  appeared  or  will  shortly  appear 
in  the  technical  literature,  we  confine  the  presentation  to  a  discussion  of 
results  with  explanatory  material  as  needed. 

Section  2.0  reviews  recent  theoretical  advances  in  maximum  entropy 
modelling.  Work  prior  to  the  present  effort  considered  a  somewhat  restricted 
class  of  parameter  uncertainties  occurring  in  structural  modelling  and  induced  a 
maximum  entropy  stochastic  model  from  a  set  of  parameter  statistical  data 
consisting  of  the  relaxation  time-scales.  Although  the  relaxation  times 
constitute  the  minimum  data  set  needed  to  induce  a  well-defined  probability  model 
via  the  maximum  entropy  principle,  their  use  as  basic  descriptors  of  the  degree  of 
parameter uncertainty differs somewhat from traditional engineering practice. To
close this gap, Section 2.0 presents the maximum entropy stochastic model for
which  the  available  parameter  data  is  given  in  terms  of  bounds  on  parameter 
deviations  from  nominal  values.  This  new  formulation  readily  extends  to  a  very 
general  class  of  parameter  uncertainties  and  thus  considerably  generalizes  earlier 
results. 


*See 'Addendum to Section 5.0 of the Contract Proposal,' 14 October, 1983, which
gives  the  revised  work  schedule. 

In addition, we consider what is needed in the formulation in order to
provide an a priori guarantee of performance over the stipulated range of
parameter variations. It turns out that such a guarantee is afforded by a
relatively  innocuous  and  straightforward  augmentation  of  the  basic  maximum  entropy 
model.  It  should  also  be  pointed  out  that  the  extended  maximum  entropy 
formulation  can  be  applied  to  nonlinearities  obeying  sector  inequalities  and  to 
structural  perturbations  which  are  important  to  fault  tolerance/reliability 
questions. 


Section  3.0  deals  with  the  optimal  projection  approach  to  designing 
quadratically  optimal,  reduced-order  dynamic  controllers  for  high-order  systems. 
First,  we  present  the  fundamental  optimal  projection  equations  for 
finite-dimensional  systems  with  deterministic  parameters  and  discuss  various 
computational  algorithms  for  their  solution.  Optimal  projection  design  results 
for  a  variant  of  the  CSDL  Model  #2  spacecraft  control  problem  are  reviewed  and 
compared  with  the  results  of  various  suboptimal  controllers  obtained  by  means  of 
order-reduction schemes. These results demonstrate the completion of Task 2.A.

Recent  theoretical  extensions  of  the  original  optimal  projection 
formulation  are  also  pointed  out  in  this  section.  First,  its  extension  to 
distributed parameter systems has been carried out in [28] and, second, the optimal
projection equations for finite-dimensional systems characterized by a general
maximum entropy stochastic model reflecting uncertainties in all system matrices
are given in [32]. The resulting design equations provide a unified basis for
subsequent  practical  implementations. 

In  addition,  the  essential  optimal  projection  idea  gives  rise  to  new 
and  more  powerful  approaches  to  two  problems  of  longstanding  importance  in  systems 
theory:  optimal  model  reduction  and  quadratically  optimal  reduced-order  state 
estimation.  The  fundamental  optimal  projection  equations  for  these  two  problems 
are  given  in  [29]  and  [30]  in  Appendices  C  and  F. 

Section  4.0  returns  to  the  main  theme  by  outlining  the  combined  optimal 
projection/maximum  entropy  (OP/ME)  design  approach.  The  results  summarized  in 
Sections  2.0  and  3.0  are  applied  to  a  structural  system  having  uncertainties  in 
the  stiffness  matrix.  Computational  techniques  are  presented  and  illustrated  by 
very  recent  results  obtained  for  the  NASA/IEEE  Design  Challenge  configuration. 

Review of the Maximum Entropy Modelling Philosophy


To  set  the  stage  for  subsequent  developments  we  first  review  the  basic 
scope  and  philosophy  of  our  approach  to  modelling  uncertainty  and  the  results 
obtained  for  dynamic  systems  with  parameter  uncertainties. 


First,  as  to  the  general  scope  of  these  investigations,  our 
considerations  will  be  confined  within  the  context  of  linear,  time-domain, 
finite-dimensional  (but  large-order)  system  models.  Moreover,  we  shall  emphasize 
those  design  problems,  e.g.,  shape  and  pointing  control  of  large  spacecraft,  which 
require  high  performance  with  robustness  restricted  to  the  lowest  levels 
consistent  with  parameter  deviations  to  be  expected  in  practice. 


The  restriction  to  linear  system  design  makes  sense  once  one  recalls 
the  inevitable  tradeoff  between  robustness  and  performance.  For  example, 
performance  requirements  during  deployment  and  erection  are  relatively  benign 
while,  at  the  same  time,  dynamic  nonlinearity  due  to  large  angle  relative  motions 
of  structural  components  engenders  modelling  difficulties.  This  problem  calls  for 
a  low  performance,  very  robust  controller  and  can  be  treated  effectively  by  a  low 
performance  but  inherently  dissipative,  e.g.,  positive-real,  controller.  Thus, 
the  specific  impact  of  parameter  uncertainties  is  not  a  critical  issue  in  this 
situation.  In  contrast,  however,  the  stringent  tolerances  imposed  in  many 
applications  upon  steady-state  figure  and  line-of-sight  (LOS)  control  demand  the 
highest  authority,  highest  gain  control  consistent  with  the  minimum  permissible 
robustness  levels.  Such  a  control  design  forces  a  detailed  examination  of  the 
influence  of  specific  modelling  errors  while,  due  to  the  performance  requirements, 
elastic  deformations  are  sufficiently  small  that  dynamic  nonlinearities  may  be 
neglected  (at  least  for  steady-state  LOS  control).  Thus,  while  extension  of  our 
approach  to  nonlinear  dynamics  would  certainly  be  of  interest,  the  first  order  of 
business  remains  the  treatment  of  modelling  uncertainties  in  linear  systems. 


To  drive  the  design  process  we  utilize  a  quadratic  functional  averaged 
over  the  parameter  statistical  ensemble  as  our  mean-square  performance  measure. 
Although  various  criticisms  have  been  leveled  at  quadratic  optimization,  it 
entails the simplest and most familiar performance criterion. Although extension
of the formulation to more sophisticated costs and performance measures would be
of interest, until one is fully capable of dealing with the issues of parameter
uncertainty  and  large  dimensionality,  this  remains  a  fruitless  undertaking. 
Moreover,  the  specific  consequences  of  uncertainties  even  for  mean-square  optimal 
design  are  not  fully  understood  and,  therefore,  constitute  the  subject  of  this 
work.  Most  importantly,  however,  it  should  be  emphasized  that  quadratic 
performance  criteria  (e.g.,  mean-square  pointing  error  and  mean-square  reflective 
surface  deformations)  are  unquestionably  meaningful  for  applications  such  as 
large-aperture,  spaceborne  antenna  systems. 


The  above  restrictions  to  time-domain  modelling  and  quadratic 
optimization  already  indicate  a  fundamental  philosophical  distinction  between  this 
work  and  the  frequency  domain  L^/H^  theory  currently  under  development  (see 
[297]-[301]). Moreover, our extensive use of stochastic systems theory stands in
stark  contrast  to  the  utter  determinism  of  frequency  domain  theory.  In  subsequent 
work  we  shall  evaluate  the  relative  merits  of  the  two  approaches  as  applied  to 
structural  control  problems.  For  the  present,  this  matter  will  not  be  dwelt  upon 
since  we  wish  to  indicate  how  the  maximum  entropy  approach  is  itself  a  fundamental 
extension  of  the  now  classical  theory  of  quadratic  optimization  for  linear  systems 
(e.g.,  [83]),  hereafter  referred  to  as  LQG  theory. 

The  evolution  of  the  maximum  entropy  approach  as  motivated  by  problems 
encountered  in  the  application  of  LQG  theory  can  be  visualized  as  in  Figures  2.1-1 
to  2.1-3.  Referring  to  Figure  2.1-1,  the  model  of  the  plant  (assumed  to  be  a 
structural  system  here)  may  be  suitably  parameterized  (e.g.,  the  parameters  might 
be  the  numerical  values  of  the  individual  elements  of  the  various  system  matrices 
referred  to  some  vector  basis)  so  that  both  the  modelled  system  and  the  actual 
system  are  defined  by  their  locations  within  the  associated  Euclidean  space  of 
parameters.  LQG-based  design  approaches  implicitly  assume  that  all  system  maps 
are  known  and,  consequently,  produce  a  design  which  is  optimal  (with  respect  to  a 
quadratic  performance  measure)  only  for  a  single  point  (associated  with  the 
nominal  values)  in  parameter  space.  However,  due  to  actual  in-mission  changes, 
mathematical  modelling  errors  arising  from  truncations  implicit  in  the  finite 
element  method,  etc.,  all  system  parameters  are  not,  in  fact,  known.  Thus,  to  put 
the matter in the most general terms, a structural model can never encompass the
"truth"; rather, a model should be regarded as a mathematical statement of what
and how much is known. Considered as such, a model must not only specify nominal
values but must also contain an admission of prior ignorance regarding parameter
deviations from expected values.

Figure 2.1-1. Modal Parameter Space

An  admission  of  prior  uncertainty  can  indeed  be  quantified  by  assuming 
the  parameters  to  be  distributed  according  to  some  probability  law  (see  Figure 
2.1-2,  where  the  shading  indicates  the  region  of  significant  probability).  An 
important  point  is  that  the  fundamental  concept  of  probability  being  employed  here 
is  not  the  traditional  relative  frequency  interpretation  but  rather  the 
information-theoretic  interpretation,  i.e.,  probability  as  a  measure  of  a  priori 
uncertainty.  This  matter  is  elaborated  in  detail  below. 

Returning  to  Figure  2.1-2,  one  might  be  tempted  to  assume  that  all 
necessary  probability  distributions  are  given  and  proceed  to  design  a  control 
which  is  optimal  "on  the  average"  by  minimizing  the  expected  value  of  a  quadratic 
cost.  Since  the  quantity  of  interest  is  quadratic  in  the  states,  the 
second-moment  matrix  of  the  system  must  be  deduced  from  the  given  probabilistic 
description  of  the  multiplicative  parameters.  This  is  a  longstanding  problem  in 
stochastic  system  theory  and  the  work  of  Kistner  [251]  illustrates  the  significant 
difficulties  of  the  problem.  In  general,  determination  of  the  second-moment 
matrix  for  random  multiplicative  parameters  demands  solution  of  an  infinite 
sequence  of  systems  of  ordinary  differential  equations,  i.e.,  the  second-moment 
equation  is  not  closed. 
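
To see concretely why the second-moment equation fails to close, a scalar sketch may help. It is an illustration added here, not taken from [251]: the scalar plant, the constant random parameter $\alpha$ (independent of the white noise $w$ of intensity $V$) and the moment symbols $m_n$ are all assumptions made for the example.

```latex
% Scalar plant with a constant random multiplicative parameter \alpha
\dot{x} = (a + \alpha)\,x + w
% Mixed moments of the squared state and the parameter
m_n(t) \triangleq E\!\left[\alpha^{n} x^{2}(t)\right], \qquad n = 0, 1, 2, \ldots
% Conditioning on \alpha gives d/dt E[x^2 | \alpha] = 2(a+\alpha) E[x^2 | \alpha] + V.
% Multiplying by \alpha^n and averaging over \alpha yields
\dot{m}_n = 2a\,m_n + 2\,m_{n+1} + V\,E[\alpha^{n}], \qquad n = 0, 1, 2, \ldots
```

The equation for the second moment $m_0$ involves $m_1$, the equation for $m_1$ involves $m_2$, and so on without end, so no finite subset of the equations determines $m_0$ unless further assumptions are made about the statistics of $\alpha$.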

Such  practical  difficulties  aside,  a  deeper  problem  arises:  the  above 
standard  problem  in  stochastic  systems  theory  presumes  at  the  outset  that  a 
complete  probabilistic  model  is  given  whereas,  in  reality,  a  complete 
probabilistic  description  is  never  available  from  empirical  determinations. 

Indeed,  before  undertaking  the  usual  procedures  of  stochastic  systems  theory,  one 
must  induce  a  complete  probability  model  from  a  highly  incomplete  set  of  available 
data.  A  fundamental  logical  requirement  is  that  this  be  done  in  a  manner  which 
avoids inventing data which does not exist; in other words, it is necessary to
construct  a  complete  probability  assignment  which  is  consistent  with  the  data  at 
hand  but  admits  the  greatest  possible  prior  ignorance  with  regard  to  all  other 
data.  This  is  the  heart  of  the  maximum  entropy  modelling  idea.  The  appropriate 
quantitative procedure has been given by Jaynes ([176-179]): first define a
measure  of  prior  ignorance,  i.e.,  the  entropy  (as  in  information  theory,  not 
thermodynamics),  then  determine  the  probability  law  which  maximizes  this 
functional  subject  to  the  constraints  imposed  by  available  data. 
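
As a minimal sketch of this two-step procedure (an illustration only, not one of the report's models), suppose a parameter takes values on a small grid and the only available datum is its mean. The grid, the target mean and the function names below are hypothetical. The entropy maximizer under a mean constraint has the exponential form, and any other distribution meeting the same constraints has strictly lower entropy.

```python
import numpy as np

def entropy(p):
    """Shannon entropy -sum p ln p (cells with p = 0 contribute nothing)."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def maxent_given_mean(values, mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum entropy pmf on `values` subject to a prescribed mean.

    The maximizer has the exponential form p_i ~ exp(beta * v_i); beta is
    found by bisection, since the mean increases monotonically with beta.
    """
    values = np.asarray(values, dtype=float)

    def pmf(beta):
        z = beta * values
        w = np.exp(z - z.max())          # shift exponent for numerical stability
        return w / w.sum()

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pmf(mid) @ values < mean:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))

values = np.linspace(-1.0, 1.0, 5)       # hypothetical grid of parameter values
target_mean = 0.3                        # the only "available data"
p_star = maxent_given_mean(values, target_mean)

# Build a competitor with the same normalization and mean: perturb p_star
# along a direction orthogonal to both constraint vectors.
rng = np.random.default_rng(0)
C = np.vstack([np.ones_like(values), values])
r = rng.normal(size=values.size)
d = r - C.T @ np.linalg.solve(C @ C.T, C @ r)
eps = 0.5 * p_star.min() / np.abs(d).max()
q = p_star + eps * d

print(entropy(p_star), entropy(q))       # the first value should be the larger
```

The competitor `q` "invents" structure that the data do not support, and pays for it with lower entropy; the maximum entropy assignment is the one that adds nothing beyond the stated constraint.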

Having  overcome  the  difficulties  imposed  by  incomplete  available  data 
in  this  way,  it  should  be  noted  that  for  flexible  mechanical  systems  one  can 
identify a "minimum data set" which is just sufficient to induce any well-defined
maximum entropy model. In other words, all admissible sets of available data must
include the minimum set, and lack of any element of this minimum set will cause
the induced maximum entropy model to "blow up" in certain crucial respects.
Since, in practice, one is provided with little or no prior statistical data, it
is  not  only  design  conservative,  but  also  realistic  to  acknowledge  as  available 
data  only  the  minimum  data  set. 

Thus,  as  sketched  in  Figure  2.1-3,  our  stochastic  design  approach 
involves  three  main  stages  (see  References  [3],  [8]  and  [11]).  First,  the  minimum 
data  set  is  constructed  and  appropriate  numerical  values  are  assigned;  next,  a 
maximum-entropy  probability  model  is  induced  from  the  minimum  data  (giving  the 
basic  design  model);  and,  finally,  a  mean-square  optimal  design  is  determined 
under  the  maximum-entropy  statistics.  This  procedure  gives  us  a  mechanism  for 
incorporating  incomplete  system  information  within  the  control  design.  Moreover, 
as  Figure  2.1-3  suggests,  the  maximum-entropy  model  is  maximally  dispersed  in 
parameter  space,  and  one  can  guarantee  that  the  resulting  design  will  very  greatly 
reduce  the  probability  of  severe  performance  degradation  in  the  face  of  parameter 
deviations. 


To recapitulate previous work in maximum entropy modelling and to
illustrate how the above procedure is specifically carried out, consider the system

$$\dot{x} = Ax + i\alpha\Lambda x + w, \qquad x \in \mathbb{C}^{n}, \tag{2.1}$$

where

$A$ = nominal value of the dynamics matrix ($A + A^*$ assumed stable),
$i\alpha\Lambda$ = perturbation in $A$ representing uncertainty in a single parameter,
$\Lambda = \mathrm{diag}\{\lambda_k\}$, $\lambda_k$ real,
$\alpha$ = real, zero-mean random parameter,
$w$ = white noise, independent of $\alpha$, having intensity $V$.

Note that for simplicity we consider only a single parameter uncertainty. The key
assumption  about  the  perturbation  to  the  nominal  dynamics  is  that  it  is 
diagonalizable  with  purely  imaginary  eigenvalues,  as  in  earlier  work.  Also,  for 
simplicity  we  present  the  above  dynamics  in  the  vector  basis  diagonalizing  the 
perturbation. 

To describe the minimum data set arrived at in [2-11], first note that
the equation for the steady-state second-moment matrix
$Q \triangleq E_{\alpha,w}[xx^*]$ of $x$ is given by

$$0 = AQ + QA^* + H[Q] + V, \tag{2.2}$$

$$H[Q] \triangleq E_{\alpha,w}\left[\,i\alpha(\Lambda xx^* - xx^*\Lambda)\,\right], \tag{2.3}$$

where we note that, regardless of the statistics of $\alpha$, $H$ as defined above is a
linear mapping from $\mathbb{C}^{n\times n}$ into itself. Normalizing the random variable,
$\hat{\alpha} \triangleq \alpha/\sigma$, it turns out that

$$H_{kk}[Q] = 0, \tag{2.4a}$$

$$\lim_{\sigma\to\infty} \frac{1}{\sigma}\, H_{kj}[Q] = -\frac{1}{T}\,|\lambda_k - \lambda_j|\, q_{kj}, \tag{2.4b}$$

$$T \triangleq \frac{1}{2}\int_{-\infty}^{\infty} dt\, E[\cos \hat{\alpha} t] = \int_{0}^{\infty} dt\, E[\cos \hat{\alpha} t]. \tag{2.4c}$$

Equation (2.4a) follows by definition and (2.4b) governs the asymptotic behavior
of $H[Q]$ for large uncertainty levels. The quantity $T$, termed the relaxation
time-scale, plays a key role in determining the magnitude of many phenomena
associated with the impact of parameter uncertainties.

Equations (2.4) summarize the required minimum data set for parameter
statistics. Note that precisely this data, assumed available for the perturbation
$i\alpha\Lambda$, does not, in fact, imply either: (1) anything about the statistical
dependence of the diagonal terms $i\alpha\lambda_k$ of $i\alpha\Lambda$; or (2) that $i\alpha\lambda_k$ is a random
variable, constant in time, i.e., the data (2.4) permits each $i\alpha\lambda_k$ to be a
stationary random process.
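
The two ingredients of the data set (2.4) can be checked numerically in a small sketch. It rests on assumptions that should be stated plainly: (2.4c) is read as $T = \int_0^\infty E[\cos\hat{\alpha}t]\,dt$, the normalized parameter $\hat{\alpha}$ is taken to be standard normal purely for illustration, and the values of $\lambda_k$ are hypothetical. For a standard normal $\hat{\alpha}$, $E[\cos\hat{\alpha}t] = e^{-t^2/2}$, so the relaxation time-scale under this reading is $\sqrt{\pi/2}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# (2.4a): the diagonal of i*alpha*(Lam x x^* - x x^* Lam) vanishes sample by sample,
# so its expectation H_kk[Q] is zero whatever the statistics of alpha.
lam = np.array([1.0, 2.5, -0.7])             # hypothetical lambda_k values
Lam = np.diag(lam)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
alpha = rng.normal()
xx = np.outer(x, x.conj())
H_sample = 1j * alpha * (Lam @ xx - xx @ Lam)
diag_is_zero = np.allclose(np.diag(H_sample), 0.0)

# (2.4c), assumed reading: T = int_0^inf E[cos(alpha_hat * t)] dt.
# For a standard normal alpha_hat, E[cos(alpha_hat * t)] = exp(-t^2 / 2).
t = np.linspace(0.0, 12.0, 120001)
char_fn = np.exp(-0.5 * t**2)
T_numeric = np.sum(0.5 * (char_fn[1:] + char_fn[:-1]) * np.diff(t))  # trapezoid rule
T_exact = np.sqrt(np.pi / 2.0)

print(diag_is_zero, T_numeric, T_exact)
```

A different probability law for $\hat{\alpha}$ changes the number $T$ but not the structure of (2.4), which is why $T$ can serve as the single statistical descriptor of the uncertainty.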

To be clear about what, under data (2.4), is known, as opposed to what
is not known, we restate the situation as follows. The system evolves in time
according to

$$\dot{x} = Ax + i\hat{\Lambda}x + w, \qquad \hat{\Lambda} \triangleq \mathrm{diag}[\hat{\lambda}_k(t)],$$

where each $\hat{\lambda}_k$ is a real-valued stationary random process with zero mean and
finite total power and $\hat{\Lambda}$ is such that, with the normalization constant $\sigma$
introduced as above, relations (2.4) hold for the quantity

$$H[Q] \triangleq E_{\hat{\Lambda},w}\left[\,i(\hat{\Lambda}xx^* - xx^*\hat{\Lambda})\,\right].$$

To select a maximally unpresumptive stochastic model of $\hat{\Lambda}$ out of the
infinitely many models consistent with the available data, we now define a measure
of a priori ignorance associated with the $\hat{\lambda}_k$'s.

First, let $T_m$ denote some partition of the real line with
$-\infty < t_0 < t_1 < \ldots < t_m$, and define the random variables
$\Lambda_k(t_i, t_{i+1})$ by (2.5a,b). [Equations (2.5a,b) are not legible in the source.]

For a given $T_m$, let $p(\Lambda; T_m)$ denote the joint probability density of the
$\Lambda_k(t_i, t_{i+1})$ for all $k$ and $i$. Then, $\hat{\Lambda}$ being a regular stationary process,
the totality of the $p(\Lambda; T_m)$ for all $T_m$ and countable $m$ uniquely defines the
probability structure of $\hat{\Lambda}$. Likewise, for given $T_m$, the entropy

$$S(\Lambda; T_m) \triangleq -\int d\Lambda\; p(\Lambda; T_m)\, \ln p(\Lambda; T_m) \tag{2.7}$$

characterizes the degree of a priori ignorance with regard to the random variables
$\Lambda_k(t_i, t_{i+1})$ for all $k$ and $i$, and the totality of the $S(\Lambda; T_m)$
for all $T_m$ describes the a priori ignorance of the process $\hat{\Lambda}(t)$. We now seek a
probability law for $\hat{\Lambda}$ which maximizes $S(\Lambda; T_m)$ for all $T_m$.

Now note that if the intervals $(t_i, t_{i+1}]$ and $(t_j, t_{j+1}]$ are
disjoint, then by considering the joint statistics of $\Lambda_k(t_i, t_{i+1})$ and
$\Lambda_k(t_j, t_{j+1})$, it follows from elementary properties of the entropy functional
([177]) that $S(\Lambda; T_m)$ is maximized when $\Lambda_k(t_i, t_{i+1})$ and $\Lambda_k(t_j, t_{j+1})$
are statistically independent. Such independence does not violate the presumed
data since (2.4) admits $\hat{\lambda}_k$'s with independent increments. Thus, $S(\Lambda; T_m)$
is maximized when each $\int dt\, \hat{\lambda}_k(t)$ is a process with independent, stationary
increments. However, this maximal value is not attained within the class of
processes having finite total power; it is only attained as a supremal value over
the class in the limit as statistical dependence between disjoint increments
passes to zero. In [2] it was shown that the correct model for this limiting
probability law is not Ito white noise but rather white noise under the
Stratonovich interpretation of stochastic integrals.
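
The independence argument above rests on a standard property of entropy: among all joint distributions with given marginals, the product of the marginals has the largest entropy. A minimal discrete sketch follows; the 2-by-2 joint probabilities are hypothetical and merely stand in for the statistics of two disjoint increments.

```python
import numpy as np

def entropy(p):
    """Shannon entropy -sum p ln p over all cells of a pmf."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# Hypothetical joint pmf of two increments (rows: first, columns: second).
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])
p_first = joint.sum(axis=1)
p_second = joint.sum(axis=0)
independent = np.outer(p_first, p_second)    # same marginals, no dependence

S_joint = entropy(joint)
S_indep = entropy(independent)
S_marginals = entropy(p_first) + entropy(p_second)

print(S_joint, S_indep, S_marginals)         # S_joint < S_indep = S_marginals
```

The gap between `S_indep` and `S_joint` is the mutual information of the two increments, which is zero only when they are independent.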

Thus, considering joint statistics of disjoint increments of $\int dt\, \hat{\lambda}_k(t)$
for each $k$, the maximal (or rather supremal) value of $S(\Lambda; T_m)$ is attained (for
all $T_m$) when all of the $\hat{\lambda}_k$'s are Stratonovich white noise processes. This
supremal value is approached without violating the constraints imposed by
available data (2.4) and it remains only to determine the incremental covariance
of the Wiener processes $\int dt\, \hat{\lambda}_k$, $k = 1, \ldots, N$. The data (2.4) does, in fact,
afford a unique determination, summarized as follows.

For all $T_m$, $S(\Lambda; T_m)$ attains its supremal value within the
constraints (2.4) when $x(t)$ evolves according to the Stratonovich white
noise model

$$dx_t = \left(A - \tfrac{1}{2}\{\rho\}\right)x_t\,dt + i\,d\hat{\Lambda}_t\,x_t + dW_t, \tag{2.8}$$

where $W_t$ is a Wiener process with intensity matrix $V$ and the
incremental covariance matrix, $\rho$, of the Wiener process $\hat{\Lambda}_t$ is given by

$$\rho_{kj} = \frac{\sigma}{T}\left[\,|\lambda_k| + |\lambda_j| - |\lambda_k - \lambda_j|\,\right]. \tag{2.9}$$

The above equation for $x$ is interpreted as an Ito stochastic
differential equation and the term $-\tfrac{1}{2}\{\rho\}$ is the so-called Stratonovich correction,
where we introduce the notation

$$\{M\} \triangleq \mathrm{diag}\{M_{11}, M_{22}, \ldots, M_{nn}\} \tag{2.10}$$

for any square matrix $M$.
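
The origin of the correction term can be traced with the standard Stratonovich-to-Ito conversion rule. The sketch below is a reading aid rather than part of the original derivation. The unit matrices $E_k$ (a single 1 in the $(k,k)$ position) are notation introduced here, $\rho_{kj}\,dt = E[d\hat{\lambda}_k\,d\hat{\lambda}_j]$ is assumed, and the symbol $\circ$ marks a Stratonovich differential in this block only (it is not the Hadamard product of Lemma 2.1.1).

```latex
% Stratonovich form of the multiplicative-noise model
dx_t = A x_t\,dt + i \sum_k E_k x_t \circ d\hat{\lambda}_k(t) + dW_t
% Conversion rule: for noise terms g_k(x) \circ d\hat{\lambda}_k with
% E[d\hat{\lambda}_k d\hat{\lambda}_j] = \rho_{kj} dt, the Ito drift gains
% (1/2) \sum_{k,j} \rho_{kj} (\partial g_k / \partial x) g_j(x).
% Here g_k(x) = i E_k x, and E_k E_j = \delta_{kj} E_k, so
\tfrac{1}{2} \sum_{k,j} \rho_{kj}\,(iE_k)(iE_j)\,x
  = -\tfrac{1}{2} \sum_k \rho_{kk} E_k\, x
  = -\tfrac{1}{2} \{\rho\}\,x
% The equivalent Ito form is therefore
dx_t = \left(A - \tfrac{1}{2}\{\rho\}\right) x_t\,dt + i\,d\hat{\Lambda}_t\,x_t + dW_t
```

Only the diagonal of $\rho$ enters the drift because the perturbation is diagonal in the chosen basis; the off-diagonal entries of $\rho$ appear only in the second-moment equation.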

The following properties of the above stochastic model are easily shown.

Lemma 2.1.1

Consider the stochastic system given by (2.8), (2.9) and accompanying
definitions. Then

a. $\rho$ is nonnegative definite. (Thus X is a well-defined Wiener
process.)

b. The steady-state second moment matrix Q satisfies

$$0 = \left(A - \tfrac{1}{2}\{\rho\}\right)Q + Q\left(A - \tfrac{1}{2}\{\rho\}\right)^* + \rho \circ Q + V, \qquad (2.11)$$

where $\circ$ denotes the Hadamard (element-by-element) product of two matrices.
Alternately, (2.11) can be written as

$$0 = AQ + QA^* + H[Q] + V \qquad (2.12)$$

S*i  •  4  iak  -  V 

(Thus,  the  constraints  (2.4)  are  identically  satisfied.) 
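
As a numerical illustration of (2.11), the following sketch solves the modified Lyapunov equation for a small example by vectorization. The matrices A, rho and V are made-up values chosen only for illustration (they are not taken from this report), and the helper name solve_second_moment is hypothetical.

```python
import numpy as np

def solve_second_moment(A, rho, V):
    """Solve 0 = As Q + Q As^* + rho∘Q + V, where As = A - 0.5*{rho}."""
    n = A.shape[0]
    As = A - 0.5 * np.diag(np.diag(rho))   # {rho} keeps only the diagonal of rho
    I = np.eye(n)
    # Row-major vectorization: vec(As Q) = (As ⊗ I) vec(Q),
    # vec(Q As^*) = (I ⊗ conj(As)) vec(Q), vec(rho∘Q) = diag(vec(rho)) vec(Q).
    L = np.kron(As, I) + np.kron(I, As.conj()) + np.diag(rho.reshape(-1))
    return np.linalg.solve(L, -V.reshape(-1)).reshape(n, n)

# Illustrative data only (not from the report).
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
rho = np.array([[0.2, 0.1], [0.1, 0.3]])   # symmetric, nonnegative definite
V = np.eye(2)

Q = solve_second_moment(A, rho, V)
As = A - 0.5 * np.diag(np.diag(rho))
residual = As @ Q + Q @ As.conj().T + rho * Q + V
print("max |residual| =", np.max(np.abs(residual)))
```

Because the Hadamard term is linear in Q, the whole equation is a linear system in the entries of Q. The Kronecker form is convenient for small examples, though it would be too expensive for high-order structural models.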

2.2  Maximum  Entropy  Model  Under  Parameter  Bounds 

In this section we briefly outline the main results of [34]. Consider
the  linear  system 

xa  *  (A  +  aA)Xa  +  w,  (2.13) 


where the notation "$x_a$" emphasizes the dependence of the state on the imperfectly
known parameter a. Here we assume, as is often the case in practice, that the
only available information concerning a is bounds, which for present purposes are
of the form

$$a \in [-\sigma, \sigma] \qquad (2.14)$$

The equation for the second moment $Q_a$ of $x_a$ conditioned on a(t) has
the form

$$\dot{Q}_a = A Q_a + Q_a A^* + M[Q_a] + V \qquad (2.15)$$

The problem we pursue is the following. Assuming only bounds on the
stochastic modification term M in (2.15) of the form:

'a  lv  “i,j ISor|  +  V  W’  l2-16> 

determine  a  realization  of  a  which  maximizes  the  parameter  entropy. 

Lemma  2.2.1 


The induced maximum entropy model for the second moment Q subject to
(2.16) is given by

$$\dot{Q} = AQ + QA^* + H_\sigma[Q] + V \qquad (2.17)$$

%  tQl)Kj  =  -^K  +  V  °Kj 

The  usefulness  of  Lemma  2.2.1  lies  in  the  fact  that  it  leads  to  the 
following  extended  optimization  problem.  Determine  feedback  compensation  gains 
which  minimize  (interpret  all  notation  as  closed-loop;  see  [9]) 

$$J_e = \operatorname{tr}[QR] = \operatorname{tr}[PV]$$

subject to

$$0 = AQ + QA^* + H_\sigma[Q] + \mathcal{V},$$

$$0 = A^*P + PA + H_\sigma[P] + \mathcal{R},$$

where 


v  =  V  +  V  [Q], 

£  =  R  +  R  [P], 

V  [Q]  i  diag  j  £2cr|XK  +  Xp*l 
K 

R  [P]  =  diag 
K 


1  1 1 
"V  'I 


The  main  result,  which  follows,  shows  that  under  the  formulation  of  the 
extended  problem,  the  cost  can  be  bounded  from  above  (i.e.,  guaranteed)  over  the 
assumed  parameter  range. 


Theorem  2.2.1 


If Q, P solve the extended problem, then

$$Q \ge Q_a, \qquad P \ge P_a,$$

for all $P_a$ and $Q_a$ solving the original problem, and

$$J_e(Q, P) \ge J_e(a).$$

Furthermore, the perturbed dynamics matrix in (2.13) is stable for all $a \in [-\sigma, \sigma]$.

The  details  of  the  above  formulation  together  with  the  proof  of  the 
above  theorem  are  to  appear  in  [34].  For  the  present,  we  note  that  this 
formulation  opens  the  way  to  a  stochastic  treatment  which  guarantees  stability 
over  the  stipulated  ranges  of  parameter  variations. 
THE OPTIMAL PROJECTION APPROACH TO REDUCED ORDER DESIGN


3.1 Review of Optimal Projection Results

Optimal projection design is based on a series of three results which
characterize the quadratically optimal reduced-order model, reduced-order state
estimator and reduced-order dynamic compensator. Assuming a purely dynamic linear
structure for the desired system (model, estimator, or compensator), whose order
is determined by implementation constraints, a parameter optimization approach is
taken. There is, of course, nothing novel about this approach per se and it has
been widely studied in the model reduction and control literature
([38,40,46,72-98]). This approach, however, fell into disrepute because of the
extreme complexity of the grossly unwieldy first-order necessary conditions, which
afforded little insight and engendered brute-force gradient search techniques.
The crucial discovery occurred in [7], where it was revealed that the necessary
conditions for the dynamic-compensation problem give rise to the definition of an
optimal projection as a rigorous, unassailable consequence of quadratic optimality
without recourse to ad hoc methods as in [99-102]. Exploitation of this
projection leads to immense simplification of the "primitive" form of the
necessary conditions for each of the three problems. As summarized in Figure
3.1-1, the modelling, estimation, and compensation design equations form a natural
progression: coupled systems of two, three, and four matrix equations whose
solutions determine the desired gains $(A_m, B_m, C_m)$, $(A_e, B_e, C_e)$, and
$(A_c, B_c, C_c)$. The novel equations are the modified Lyapunov equations for the
reduced-order modelling problem, versions of which arise in the estimation and
compensation problems. Since the modified Riccati equations encompass the
standard observer and regulator Riccati equations, the optimal projection
equations for the reduced-order state-estimation and dynamic-compensation problems
provide a fundamental generalization of steady-state Kalman filter and LQG theory.

3.1.1  Optimal  Model  Reduction 

The optimal model-reduction problem (Figure 3.1.1-1) involves
determining a low-order model that minimizes the steady-state, quadratically
weighted output error when the original system and reduced-order model are
subjected to white noise inputs. This problem formulation is particularly
appropriate when the reduced-order model is used to simulate the statistical
response of the high-order system; no claim whatsoever is made as to its
usefulness for estimator or controller design. The main result (Figures 3.1.1-2
and 3.1.1-3) involves a coupled system of two n×n modified Lyapunov equations
whose solutions are given by a pair of rank-$n_r$ controllability and observability
pseudogramians $\hat{Q}$ and $\hat{P}$. The matrix $\tau$ coupling these equations is idempotent since

$$\tau^2 = G^T \Gamma G^T \Gamma = G^T I_{n_r} \Gamma = \tau.$$

This oblique projection determines the optimal reduced-order model via an
aggregation as a direct consequence of optimality.

GIVEN: $x \in \mathbb{R}^n$

STEADY STATE TRACKING CRITERION

$$J(A_r, B_r, C_r) = \lim_{t \to \infty} E\left[(y - y_r)^T R\,(y - y_r)\right] \quad (R \text{ positive definite})$$

ASSUME: A, $A_r$ STABLE

$(A_r, B_r, C_r)$ MINIMAL

Figure 3.1.1-1. Optimal Model-Reduction Problem


Since the optimal projection equations for model reduction are first-order
necessary conditions for optimality, they may possess nonunique solutions
corresponding to multiple local extrema (Figure 3.1.1-4). The mechanism
responsible for this phenomenon becomes clear upon characterizing the optimal
projection as a sum of rank-1 eigenprojections of the product of the solutions of
an equivalent system of "standard" Lyapunov equations (Figures 3.1.1-5 and
3.1.1-6): the first-order necessary conditions are ambiguous in the sense that
they fail to specify which $n_r$ eigenprojections comprise the optimal projection
corresponding to a solution (i.e., global minimum) of the optimal model-reduction
problem. Specifically, since the pseudogramians $\hat{Q}$ and $\hat{P}$ can be rank deficient in

$$\binom{n}{n_r} = \frac{n!}{n_r!\,(n - n_r)!}$$

ways, there may be precisely this many "extremal" projections
corresponding to an identical number of local extrema.
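
The eigenprojection construction described above can be sketched numerically. The example below takes two illustrative positive definite matrices (made-up values, not solutions of the optimal projection equations), forms the rank-1 eigenprojections of their product, and sums every choice of $n_r$ of them to obtain the candidate oblique projections. The function name eigenprojections is hypothetical.

```python
import numpy as np
from itertools import combinations
from math import comb

def eigenprojections(Q, P):
    """Rank-1 eigenprojections Pi_i = S E_i S^{-1} of the product Q @ P."""
    lam, S = np.linalg.eig(Q @ P)
    S_inv = np.linalg.inv(S)
    # S E_i S^{-1} is the outer product of column i of S with row i of S^{-1}.
    return lam.real, [np.outer(S[:, i], S_inv[i, :]).real for i in range(len(lam))]

# Illustrative positive definite matrices (not solutions of the design equations).
Q = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]])
P = np.diag([1.0, 2.0, 3.0])
n, n_r = 3, 2

lam, Pi = eigenprojections(Q, P)
candidates = [sum(Pi[i] for i in subset) for subset in combinations(range(n), n_r)]

for tau in candidates:
    assert np.allclose(tau @ tau, tau)    # each candidate is idempotent
print(len(candidates), "candidate projections; binomial count =", comb(n, n_r))
```

Each subset of eigenprojections yields a valid rank-$n_r$ projection, which is why the necessary conditions alone cannot single out the global minimum.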


An  immediate  consequence  of  this  insight  is  a  rigorous  extremality 
context  for  Moore's  balancing  method  ([50])  thereby  demonstrating  its  quadratic 
nonoptimality.  Specifically,  examples  can  easily  be  constructed  for  which  the 
reduced-order  model  obtained  from  the  balancing  method  is  much  worse  with  respect 
to  the  least-squares  criterion  than  the  quadratically  optimal  reduced-order  model 
(Figure 3.1.1-7). In general, all that can be said is that the presence of a weak
subsystem  (in  the  sense  of  Moore)  indicates  that  the  reduced-order  model  obtained 
by  truncation  in  the  balanced  basis  may  be  in  the  proximity  of  an  extremal  of  the 
optimal  model-reduction  problem;  however,  this  extremal  may  very  well  be  a  local 
(or  even  global)  maximum. 
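
The two-state example of Figure 3.1.1-7 can be checked with a short calculation. The sketch below assumes unit noise intensity and unit output weighting (V = I, R = I) and picks diagonal square roots for B and C, since the figure specifies only $BB^T$ and $C^TC$; the helper names lyap and truncation_cost are hypothetical.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by row-major vectorization."""
    n = A.shape[0]
    I = np.eye(n)
    L = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(L, -W.reshape(-1)).reshape(n, n)

alpha = np.array([1.0, 1.0e6])
beta = np.array([1.0, 1.0e6])
gamma = np.array([1.0, 1.0e3])
A = np.diag(-alpha)
B = np.diag(np.sqrt(beta))     # assumed factor of B B^T = diag(beta)
C = np.diag(np.sqrt(gamma))    # assumed factor of C^T C = diag(gamma)

def truncation_cost(keep):
    """Steady-state output-error cost when only state `keep` is retained."""
    Ar = A[np.ix_([keep], [keep])]
    Br = B[[keep], :]
    Cr = C[:, [keep]]
    # Error system: full model and reduced model driven by the same noise.
    Ae = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), Ar]])
    Be = np.vstack([B, Br])
    Ce = np.hstack([C, -Cr])
    X = lyap(Ae, Be @ Be.T)
    return float(np.trace(Ce @ X @ Ce.T))

# Second-order modes (Hankel singular values) for this diagonal system.
hsv = np.sqrt(beta * gamma) / (2.0 * alpha)
J_balanced = truncation_cost(keep=int(np.argmax(hsv)))  # discard the weaker mode
J_other = truncation_cost(keep=int(np.argmin(hsv)))     # discard the stronger mode
print("balancing choice J =", J_balanced, "  alternative J =", J_other)
```

Under these assumptions the mode with the smaller second-order mode carries the larger output-error cost, which is the point the figure makes.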




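LEMMA. IF $\hat{Q}$ AND $\hat{P}$ ARE NONNEGATIVE DEFINITE THEN THE PRODUCT
$\hat{Q}\hat{P}$ IS NONNEGATIVE SEMISIMPLE. HENCE IF RANK $\hat{Q}\hat{P} = n_r$ THEN THERE
EXIST $G, \Gamma \in \mathbb{R}^{n_r \times n}$ AND POSITIVE SEMISIMPLE $M \in \mathbb{R}^{n_r \times n_r}$ SUCH THAT

$$\hat{Q}\hat{P} = G^T M \Gamma, \qquad \Gamma G^T = I_{n_r}.$$

Figure 3.1.1-2. Factorization Lemma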
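IF ADMISSIBLE $(A_r, B_r, C_r)$ IS OPTIMAL THEN THERE EXIST n × n
NONNEGATIVE-DEFINITE MATRICES $\hat{Q}$ AND $\hat{P}$ SUCH THAT, FOR SOME
$(G, M, \Gamma)$-FACTORIZATION OF $\hat{Q}\hat{P}$, $(A_r, B_r, C_r)$ ARE GIVEN BY

$$A_r = \Gamma A G^T, \qquad B_r = \Gamma B, \qquad C_r = C G^T,$$

AND SUCH THAT, WITH $\tau \triangleq G^T \Gamma$ AND $\tau_\perp = I_n - \tau$,

$$0 = A\hat{Q} + \hat{Q}A^T + BVB^T - \tau_\perp BVB^T \tau_\perp^T,$$

$$0 = A^T\hat{P} + \hat{P}A + C^TRC - \tau_\perp^T C^TRC\,\tau_\perp,$$

$$\operatorname{rank} \hat{Q} = \operatorname{rank} \hat{P} = \operatorname{rank} \hat{Q}\hat{P} = n_r.$$

Figure 3.1.1-3. Optimal Reduced-Order Model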
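$A = \operatorname{diag}(-\alpha_1, \ldots, -\alpha_n)$, $\alpha_i > 0$

$BB^T = \operatorname{diag}(\beta_1, \ldots, \beta_n)$, $C^TC = \operatorname{diag}(\gamma_1, \ldots, \gamma_n)$, $\beta_i, \gamma_i > 0$

$\hat{Q} = \operatorname{diag}(\hat{Q}_1, \ldots, \hat{Q}_n)$, $\hat{P} = \operatorname{diag}(\hat{P}_1, \ldots, \hat{P}_n)$

$$\hat{Q}_i = \frac{\beta_i}{2\alpha_i}\,\delta_i, \qquad \hat{P}_i = \frac{\gamma_i}{2\alpha_i}\,\delta_i, \qquad \delta_i = 0 \text{ or } 1$$

But: only $n_r$ of the $\delta_i$'s are 1

$$\Rightarrow \binom{n}{n_r} = \frac{n!}{n_r!\,(n - n_r)!} \ \text{possible extrema}$$

Figure 3.1.1-4. Existence of Multiple Extrema in Optimal Model Reduction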
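$\hat{Q}, \hat{P}$ nonnegative definite $\Rightarrow$ $\hat{Q}\hat{P}$ nonnegative semisimple

$$\hat{Q}\hat{P} = S \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} S^{-1} = S\left(\sum_{i=1}^{n} \lambda_i E_i\right) S^{-1} = \sum_{i=1}^{n} \lambda_i \Pi_i[\hat{Q}\hat{P}], \qquad \lambda_i \ge 0$$

$$E_i = \operatorname{diag}(0, \ldots, 0, 1, 0, \ldots, 0) \ \text{(1 in position } i\text{)}, \qquad \Pi_i[\hat{Q}\hat{P}] = S E_i S^{-1}$$

$$\tau = \sum_{r=1}^{n_r} \Pi_{i_r}[\hat{Q}\hat{P}]$$

Figure 3.1.1-5. Obtaining an Oblique Projection From an Eigenprojection
Decomposition of a Semisimple Matrix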


tAt)Q  +  Q(A  -  tAt)t  +  BVBt 
^Ar)TP  +  P(A  -  tAt)  +  CTRC 


$$\tau = \sum_{r=1}^{n_r} \Pi_{i_r}[\hat{Q}\hat{P}]$$

Figure 3.1.1-6. Optimal Projection Equations for Model Reduction:
Standard Lyapunov Equation Form
$n = 2$, $\alpha_1 = 1$, $\alpha_2 = 10^6$, $\beta_1 = 1$, $\beta_2 = 10^6$, $\gamma_1 = 1$, $\gamma_2 = 10^3$

$\Rightarrow$ second-order modes $\sigma_1 = .5$, $\sigma_2 \approx .012$

Moore's approach $\Rightarrow$ truncate $x_2$ $\Rightarrow$ J = 500
However: truncate $x_1$ $\Rightarrow$ J = .5

This attainment of a global maximum
is a direct result of incorrect selection
of $n_r$ eigenprojections in constructing
the optimal projection

Figure 3.1.1-7. Inadequacy of the Singular Values in Quadratically Optimal
Model Reduction
Although the optimal projection equations characterize all extrema, it
is desirable for theoretical and practical reasons to directly characterize the
global minimum. One approach, investigated in [29], involves decomposing the cost
as

$$J = \sum_{i} J_i$$

where each $J_i$ corresponds to a particular eigenprojection. This technique,
which is reminiscent of Skelton's Component Cost Analysis [52,101], permits rapid
sorting of the local extrema by directing the algorithm to the global optimum
(Figure 3.1.1-8).

3.1.2  Optimal  Reduced-Order  Dynamic  Compensation 

Virtually  all  research  into  the  design  of  reduced-order  controllers 
involves  one  of  two  sequential  procedures:  model  reduction  followed  by  controller 
design or controller design followed by controller reduction (Figure 3.1.2-1).

The  optimal  projection  equations  represent  a  radical  departure  from  both  of  these 
approaches  by  directly  characterizing  the  quadratically  optimal  reduced-order 
controller  for  a  high-order  model.  The  form  of  the  necessary  conditions  (a 
coupled  system  of  two  modified  Riccati  equations  and  two  modified  Lyapunov 
equations given by equations (2.18)-(2.21) of [23] in Appendix B) indicates the
essential  presence  of  the  LQG  and  model-reduction  operations.  The  coupling  by 
means  of  the  projection,  however,  reveals  the  inherent  inseparability  of  these 
operations  in  the  reduced-order  case  and  represents  a  graphic  portrayal  of  the 
demise  of  the  classical  separation  principle.  Hence,  optimality  considerations 
demand  that,  in  a  very  precise  sense,  reduction  and  control  design  be  performed 
simultaneously. 

The  optimal  projection  equations  also  reveal  the  suboptimality  of 
ad  hoc  controller  reduction  methods.  As  shown  in  [19]  (Appendix  A),  several 
reduction  methods  involve  equations  similar  to  the  optimal  projection  equations 
but lack crucial coupling terms. Prior to the discovery of the optimal projection
equations, the state of affairs in reduced-order controller design was
philosophically analogous to fluid mechanics had it developed without
benefit of the Navier-Stokes equations: instead of deciding which of the numerous
terms are small and hence can be neglected, engineers would have been obliged to
laboriously construct the important terms!

Figure 3.1.1-8. Computational Algorithm for Optimal Model Reduction

Figure 3.1.2-1. Optimal Projection Design for Reduced-Order Dynamic
Compensation

3.1.2.1 Optimal Projection Controller Design for Draper Model #2

The  optimal  projection  approach  was  applied  to  a  20-state  version  of 
the CSDL Model #2 (see Figure 3.1.2.1-1 [19]) excluding modelling uncertainties.
This corresponds to Task 2.A mentioned in Section 1.0, i.e., the "high authority"
control  design. 

This  example  was  used  to  compare  both  theoretically  and  numerically  the 
optimal  projection  approach  with  a  variety  of  suboptimal  controller-order 
reduction  methods.  The  theoretical  comparison  shows  that  all  current  suboptimal 
techniques  essentially  define  a  (suboptimal)  projection  characterizing  the 
reduced-order  compensator.  In  contrast,  the  optimal  projection  design  equations 
define  the  needed  projection  by  rigorous  application  of  optimality  principles. 
Moreover,  all  the  approaches  considered  in  [19]  can  be  displayed  in  a  common 
notation,  and  this  graphically  reveals  the  suboptimal  design  equations  as  special 
cases  of  or  approximations  to  the  optimal  projection  equations. 

For numerical comparison it is standard procedure to plot the
regulation cost $E[x^T R_1 x]$ as a function of control cost $E[u^T R_2 u]$. Results
for these tradeoff curves are shown in Figure 3.1.2.1-2. The bottommost
curve represents the full-order LQG design. Since this is the best obtainable
when there is no restriction on compensator order, the problem is to obtain a
lower-order design whose tradeoff curve is as close to the LQG results as possible.

The thin black lines in Figure 3.1.2.1-2 show the $n_c$ = 10, 6, and 4
designs obtained via Component Cost Analysis [101], where $n_c$ denotes the
compensator dimension. This appears to be the most successful suboptimal method
applied to the example problem considered here. Note that the 10th and 6th order
compensator designs are quite good, but when compensator order is sufficiently low
($n_c$ = 4) and controller bandwidth sufficiently large (R < 5.0), the method fails
to yield stable designs. This difficulty is characteristic of all suboptimal
techniques surveyed, and, in fairness, it should be noted that most other
suboptimal design methods fail to give stable designs for compensator orders below
10.

In contrast, the width of the grey line in Figure 3.1.2.1-2 encompasses
all the optimal projection results for compensators of orders 10, 6, and 4.

3.1.3  Review  of  Infinite-Dimensional  Results 


The  optimal  projection  equations  for  fixed-order  dynamic  compensation 
have  been  generalized  in  [28]  (Appendix  D)  to  the  case  in  which  the  controlled 
system  is  an  infinite-dimensional  system  in  Hilbert  space.  Mathematically,  the 
dynamics operator A is assumed to generate a $C_0$ semigroup, the stochastic
differential  equation  is  treated  as  an  evolution  equation,  white  noise  is 
interpreted  in  the  sense  of  Balakrishnan  ([134])  and  the  input  and  output 
operators  B  and  C  are  assumed  to  be  bounded. 

In  addition  to  the  above  mathematical  considerations  associated  with 
the  physical  description  of  the  distributed  parameter  system,  optimal  projection 
design  directly  addresses  the  practical  constraints  according  to  Athans  ([130]) 
and  Balas  ([131])  of:  1)  finitely  many  sensors  and  actuators,  2)  a 
finite-dimensional controller, and 3) natural system dissipation. The validity
of  2)  is  apparent  from  the  fact  that  processing  and  transmitting  electrical  signals 
by  conventional  analog  or  digital  components  constitutes  finite-dimensional  action. 
Hence,  although  distributed  parameter  systems  are  most  accurately  represented  by 
infinite-dimensional  models,  real-world  considerations  demand  that  implementable 
controllers  be  modelled  as  finite-dimensional  lumped  parameter  systems. 

Clearly,  the  above  observations  effectively  preclude  the  possibility  of 
realizing  infinite-dimensional  controllers  that  involve  full-state  feedback  or 
full-state  estimation  (see,  e.g.,  [132-134]  and  the  numerous  references  therein). 
Although  the  transition  to  finite-dimensional,  implementable  controllers  proceeds 
by  means  of  approximation  schemes,  these  results  only  guarantee  optimality  in  the 
limit,  i.e.,  as  the  order  of  the  approximating  controller  increases  without  bound 
([135-138]).  Hence,  there  is  no  guarantee  that  a  particular  approximate  (i.e., 
discretized)  controller  is  actually  optimal  over  the  class  of  approximate 
controllers  of  a  given  order  dictated  by  implementation  constraints.  Moreover, 
even  if  an  optimal  approximate  finite-dimensional  controller  could  be  obtained,  it 
would  almost  certainly  be  suboptimal  in  the  class  of  all  controllers  of  the  given 
order. 
Although the sequence of operations described above corresponds to the
left-hand branch of Figure 3.1.3-1, one can alternatively follow the right-hand
branch, i.e., replace the distributed parameter system with a high-order
finite-dimensional model and utilize the optimal projection equations of [23]
(Appendix B) to obtain a fixed-order controller. The most direct route, however,
involves a direct characterization of the optimal finite-dimensional, fixed-order
controller for the original distributed parameter system. The resulting equations
are exactly analogous to the optimal projection equations obtained in [23] for the
finite-dimensional case. Instead of a system of four matrix equations, however,
the result now involves a system of four operator equations whose solutions
characterize the optimal finite-dimensional fixed-order dynamic compensator (see
(3.9)-(3.18) of [28] in Appendix D). Moreover, the optimal projection now becomes
a bounded idempotent Hilbert-space operator whose rank is precisely equal to the
order of the compensator. To illustrate the beauty and simplicity of this result,
note, for example, that the $n_c \times n_c$ dynamics matrix $A_c$ of the controller is given
by (see (3.9) of [28])

$$A_c = \Gamma\left(A - Q C^* V_2^{-1} C - B R_2^{-1} B^* P\right) G^*$$

where $G^*$ is a mapping from $\mathbb{R}^{n_c}$ into the domain of A and $\Gamma$ is a mapping from
the Hilbert space into $\mathbb{R}^{n_c}$. Hence, the above expression is indeed a valid
representation of an $n_c \times n_c$ matrix which, most interestingly, incorporates an
internal model of the full dynamics operator of the infinite-dimensional system!

Since  the  only  explicit  assumption  on  the  unbounded  dynamics  operator 
is  that  it  generate  a  strongly  continuous  semigroup,  these  results  are  potentially 
applicable  to  a  broad  range  of  specific  partial  and  functional  differential 
equations.  Their  actual  applicability  is  essentially  limited  by  practical 
constraint  3).  Because  of  the  steady-state  problem  setting,  it  is  implicitly 
assumed  that  the  distributed  parameter  system  is  stabilizable,  i.e.,  that  there 
exists  a  dynamic  compensator  of  a  given  order  such  that  the  closed-loop  system  is 
uniformly  stable.  The  stabilization  problem  has  been  considered  in  [142-148]  for 
delay,  parabolic,  and  damped  hyperbolic  systems. 
4.0 OPTIMAL PROJECTION/MAXIMUM ENTROPY DESIGN SYNTHESIS


4.1 Design Equations

By  combining  maximum  entropy  stochastic  modelling  with  optimal 
projection  design,  we  obtain  a  powerful  system  design  methodology  which 
generalizes  LQG  theory  in  two  fundamental  respects:  design  of  reduced-order 
controllers  plus  accommodation  of  a  priori  parameter  uncertainties.  The  most 
general  results  obtained  thus  far  apply  to  the  modelling,  estimation,  and  control 
problems  and  are  presented  in  detail  in  [31]  (see  Appendix  E).  For  the  control 
problem,  these  results  are  summarized  in  Figures  4.1-1  to  4.1-4. 

The control-design problem is summarized in Figure 4.1-1 and reveals
the requirement of a low-order controller of fixed dimension and the technique of
using Stratonovich white noise to model parameter uncertainties. Figure 4.1-2
summarizes the stability conditions under which the optimization is carried out.
The optimal controller gains $A_c$, $B_c$ and $C_c$ are given in Figure 4.1-3 in
terms of Q, P, $\hat{Q}$ and $\hat{P}$ which are, in turn, determined by a coupled system of two
modified Riccati equations and two modified Lyapunov equations (shown in Figure
4.1-4).
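
The second-moment stability condition of Figure 4.1-2 can be illustrated with a short numerical check. The sketch below assumes the test is that the matrix $A_s \oplus A_s + \sum_i A_i \otimes A_i$, with $A_s = A + \tfrac{1}{2}\sum_i A_i^2$, has all eigenvalues in the open left half-plane. The matrices are generic made-up examples standing in for the closed-loop quantities, and the function name second_moment_stable is hypothetical.

```python
import numpy as np

def second_moment_stable(A, A_list):
    """True if As ⊕ As + sum_i Ai ⊗ Ai is Hurwitz, with As = A + 0.5*sum_i Ai^2."""
    n = A.shape[0]
    I = np.eye(n)
    As = A + 0.5 * sum(Ai @ Ai for Ai in A_list)     # Stratonovich-corrected dynamics
    M = np.kron(As, I) + np.kron(I, As) + sum(np.kron(Ai, Ai) for Ai in A_list)
    return bool(np.max(np.linalg.eigvals(M).real) < 0.0)

# Lightly damped oscillator with unit natural frequency (illustrative values).
A = np.array([[-0.1, 1.0], [-1.0, -0.1]])
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

frequency_uncertainty = [0.5 * J]        # noise enters through the frequency
gain_uncertainty = [0.5 * np.eye(2)]     # noise scales the whole dynamics matrix

print("frequency uncertainty stable:", second_moment_stable(A, frequency_uncertainty))
print("gain uncertainty stable:     ", second_moment_stable(A, gain_uncertainty))
```

In this toy case the skew-symmetric (frequency-type) uncertainty leaves the second moment stable, while a uniform multiplicative disturbance of the same intensity does not.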


4.2 Combined OP/ME Design for the NASA SCOLE Model

Harris  GASD  recently  completed  a  NASA/LaRC  supported  study  on  the 
Spacecraft  Control  Laboratory  Experiment  (SCOLE)  configuration  shown  in  Figure 
4.2-1  which  is  the  subject  of  the  NASA/IEEE  Design  Challenge.  Full  details  of  our 
model and design results are given in [27].

A  high-order  finite  element  model  was  constructed  for  SCOLE,  treating 
the  shuttle  and  reflector  as  rigid  bodies  and  the  connecting  mast  as  a  classical 
beam  with  torsional  stiffness.  This  model  includes  the  Shuttle  products-of- 
inertia  and  the  offset  between  reflector  center-of-mass  and  its  attachment  point 
on  the  mast.  The  quadratic  performance  penalty  on  the  system  state  is  simply  the 
total  mean  square  line-of-sight  error. 

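As part of the study, we considered a system model including the first
eight modes and (1) performed LQG studies to select the control authority and
establish a baseline and (2) designed full-order (16-state) compensators with a
maximum entropy model of modal frequency uncertainties. The maximum entropy model
assumed that all elastic mode frequencies were subjected to independent variations
(due to modelling error) of $+\sigma$ to $-\sigma$ relative to their nominal values. Thus the
positive number $\sigma$ denotes the overall fractional uncertainty.

HIGH-ORDER, UNCERTAIN PLANT

PERFORMANCE CRITERION

$$J(A_c, B_c, C_c) = \lim_{t \to \infty} E\left[x^T R_1 x + 2x^T R_{12} u + u^T R_2 u\right]$$

Technical Assumption: $B_i \neq 0 \Rightarrow C_i = 0$

Figure 4.1-1. Steady-State Reduced-Order Dynamic-Compensation
Problem with Parameter Uncertainties

Nominal Closed-Loop Dynamics Matrix

$$\tilde{A} = \begin{bmatrix} A & B C_c \\ B_c C & A_c \end{bmatrix}$$

Uncertainty Due to ith Uncertain Parameter

$$\tilde{A}_i = \begin{bmatrix} A_i & B_i C_c \\ B_c C_i & 0 \end{bmatrix}$$

Corrected Dynamics Matrix

$$\tilde{A}_s = \tilde{A} + \tfrac{1}{2}\sum_{i=1}^{p} \tilde{A}_i^2$$

STABILITY IS DETERMINED BY

$$\tilde{A}_s \otimes I_{n+n_c} + I_{n+n_c} \otimes \tilde{A}_s + \sum_{i=1}^{p} \tilde{A}_i \otimes \tilde{A}_i$$

$\otimes$ = Kronecker product

Figure 4.1-2. Second-Moment Stability

CONTROLLER GAINS (Functions of Q, P, $\hat{Q}$, $\hat{P}$)

$$A_c = \Gamma\left(A_s - B_s R_{2s}^{-1}\mathcal{P}_s - \mathcal{Q}_s V_{2s}^{-1} C_s\right) G^T$$

$$B_c = \Gamma \mathcal{Q}_s V_{2s}^{-1}$$

$$C_c = -R_{2s}^{-1} \mathcal{P}_s G^T$$

NOTATION

$$\hat{Q}\hat{P} = G^T M \Gamma, \qquad \Gamma G^T = I_{n_c} \quad (\Leftrightarrow \tau = G^T\Gamma = \tau^2)$$

$$\mathcal{A}Q\mathcal{A}^T = \sum_{i=1}^{p} A_i Q A_i^T, \qquad \mathcal{A}Q\mathcal{B} = \sum_{i=1}^{p} A_i Q B_i, \quad \text{etc.}$$

$$A_s = A + \tfrac{1}{2}\mathcal{A}^2, \qquad B_s = B + \tfrac{1}{2}\mathcal{A}\mathcal{B}, \qquad C_s = C + \tfrac{1}{2}\mathcal{C}\mathcal{A}$$

$$R_{2s} = R_2 + \mathcal{B}^T(P + \hat{P})\mathcal{B}, \qquad V_{2s} = V_2 + \mathcal{C}(Q + \hat{Q})\mathcal{C}^T$$

$$\mathcal{Q}_s = Q C_s^T + V_{12} + \mathcal{A}(Q + \hat{Q})\mathcal{C}^T, \qquad \mathcal{P}_s = B_s^T P + R_{12}^T + \mathcal{B}^T(P + \hat{P})\mathcal{A}$$

Figure 4.1-3. Controller Gains

SOLVE FOR NONNEGATIVE-DEFINITE Q, P, $\hat{Q}$, $\hat{P}$

$$0 = A_s Q + Q A_s^T + \mathcal{A}Q\mathcal{A}^T + V_1 + (\mathcal{A} - \mathcal{B}R_{2s}^{-1}\mathcal{P}_s)\hat{Q}(\mathcal{A} - \mathcal{B}R_{2s}^{-1}\mathcal{P}_s)^T - \mathcal{Q}_s V_{2s}^{-1}\mathcal{Q}_s^T + \tau_\perp \mathcal{Q}_s V_{2s}^{-1}\mathcal{Q}_s^T \tau_\perp^T$$

$$0 = A_s^T P + P A_s + \mathcal{A}^T P \mathcal{A} + R_1 + (\mathcal{A} - \mathcal{Q}_s V_{2s}^{-1}\mathcal{C})^T \hat{P} (\mathcal{A} - \mathcal{Q}_s V_{2s}^{-1}\mathcal{C}) - \mathcal{P}_s^T R_{2s}^{-1}\mathcal{P}_s + \tau_\perp^T \mathcal{P}_s^T R_{2s}^{-1}\mathcal{P}_s \tau_\perp$$

$$0 = (A_s - B_s R_{2s}^{-1}\mathcal{P}_s)\hat{Q} + \hat{Q}(A_s - B_s R_{2s}^{-1}\mathcal{P}_s)^T + \mathcal{Q}_s V_{2s}^{-1}\mathcal{Q}_s^T - \tau_\perp \mathcal{Q}_s V_{2s}^{-1}\mathcal{Q}_s^T \tau_\perp^T$$

$$0 = (A_s - \mathcal{Q}_s V_{2s}^{-1} C_s)^T \hat{P} + \hat{P}(A_s - \mathcal{Q}_s V_{2s}^{-1} C_s) + \mathcal{P}_s^T R_{2s}^{-1}\mathcal{P}_s - \tau_\perp^T \mathcal{P}_s^T R_{2s}^{-1}\mathcal{P}_s \tau_\perp$$

$$\operatorname{rank} \hat{Q} = \operatorname{rank} \hat{P} = \operatorname{rank} \hat{Q}\hat{P} = n_c$$

$$\tau = \hat{Q}\hat{P}(\hat{Q}\hat{P})^{\#}, \qquad \tau_\perp = I_n - \tau$$

\# $\Rightarrow$ GROUP GENERALIZED INVERSE

Figure 4.1-4. Optimal Projection/Maximum Entropy Design Equations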

Although robust stability is obtained under these independent and
simultaneous variations, the robustness properties of specific designs are simply
illustrated here by looking at the variation of performance and closed-loop poles
when all modal frequencies are varied by the same fractional change from the
nominal values. In other words, we interconnect a given controller design (be it
LQG or maximum entropy) with a perturbed plant model wherein all modal frequencies
are changed by $\delta \times$ (nominal values) and evaluate the closed-loop performance and
pole locations. This is repeated for a range of values of $\delta$.
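
The evaluation procedure just described (hold the controller fixed, scale the modal frequency, recompute the closed-loop poles) can be sketched for a toy problem. The single-mode plant, rate sensor, first-order compensator and all numerical values below are hypothetical and are not the SCOLE model or any design from this study.

```python
import numpy as np

def closed_loop_matrix(omega, zeta=0.005, a=2.0, b=1.0, k=0.5):
    """Closed-loop matrix: one flexible mode, rate sensor, first-order compensator.

    States are (q, q_dot, x_c); every numerical value is hypothetical.
    """
    return np.array([
        [0.0, 1.0, 0.0],
        [-omega**2, -2.0 * zeta * omega, -k],
        [0.0, b, -a],
    ])

omega_nominal = 1.0
deltas = np.linspace(-0.2, 0.2, 9)      # fractional frequency changes

# Fixed controller, perturbed plant: record the least-damped closed-loop pole.
max_real_part = []
for delta in deltas:
    poles = np.linalg.eigvals(closed_loop_matrix(omega_nominal * (1.0 + delta)))
    max_real_part.append(float(np.max(poles.real)))

for delta, re in zip(deltas, max_real_part):
    print(f"delta = {delta:+.2f}   max Re(pole) = {re:.4f}")
```

A design is robust over the swept range when the largest real part stays negative for every value of the perturbation; plotting the poles themselves gives root-locus style pictures of the kind discussed next.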

Figure  4.2-2  shows  how  the  pole  locations  for  an  LQG  design  wander 
under  a  +5  percent  variation  of  the  modal  frequencies.  It  is  seen  that  two  of  the 
pole  pairs  are  particularly  sensitive  and  are  nearly  driven  unstable  by  only  this 
+5  percent  variation.  This  happens  because  the  associated  structural  modes 
contribute  little  to  performance  and  the  LQG  design  attempts  a  "cheap  control" 
(small  regulator  and  observer  gains)  by  placing  compensator  poles  very  close  to 
the  open-loop  plant  poles.  For  nominal  values,  this  scheme  achieves  significant 
shifts  of  open-loop  poles  with  very  small  gains,  but  it  is  highly  sensitive  to 
off-nominal  perturbations. 

Figure  4.2-3  shows  closed-loop  poles  for  the  same  conditions  except 
that a maximum entropy compensator design with $\sigma$ = 0.1 (10 percent variation
modelled)  was  utilized.  In  contrast  with  Figure  4.2-2,  the  maximum  entropy  design 
makes  the  compensator  poles  "stand-off"  deeper  in  the  left  half-plane.  (This  is  a 
direct  consequence  of  the  Stratonovich  correction.)  Consequently,  the  strong  and 
sensitive  interactions  noted  above  are  entirely  eliminated.  The  poles  associated 
with  higher-order  structural  modes  are  seen  to  vary  only  along  the  imaginary  axis 
and  are  not  destabilized. 
Figure 4.2-3. SIGMA = 0.1 Poles (axis label: IMAG AXIS (1/SEC))


Figure 4.2-4 illustrates how the total performance index for given
controller designs varies as the structural mode frequencies are perturbed
relative to their nominal values. The LQG design (which is simply a maximum
entropy design for $\sigma$ = 0) becomes unstable for variations greater than 7 percent
or less than -14 percent. In contrast, and even with a modest 10 percent level of
modelled uncertainty, the maximum entropy designs completely eliminate the sensitivity.
Note that within the parameter range for which LQG is stable, the $\sigma$ = 0.1 maximum
entropy design experiences only 12-15 percent degradation. Of course, over the
regions for which LQG is unstable, the maximum entropy designs are qualitatively
superior.


These  results  serve  to  illustrate  a  general  fact:  By  incorporating 
parameter  uncertainty  as  an  intrinsic  facet  of  the  basic  design  model,  the  maximum 
entropy  formulation  is  able  to  secure  high  levels  of  robustness  with  little 
degradation  of  nominal  performance. 

Finally, the combined OP/ME design capability was exercised, taking the
16-state maximum entropy compensator design with $\sigma$ = 0.10 frequency uncertainty
level  as  the  starting  point.  Reduced  order  compensator  designs  were  constructed 
for  compensators  of  order  14,  12,  10,  8,  6,  and  5.  Figure  4.2-5  shows  the 
tradeoff  between  performance  (total,  closed-loop  performance  index  evaluated  for 
nominal  values  of  modal  frequencies)  and  controller  dimension.  The  figure  clearly 
shows  that  performance  degradation  for  compensator  orders  above  6  is  negligible. 
The  6th  order  controller  sacrifices  only  3  percent  of  the  performance  of  the 
full-order  (16-state)  controller.  This  would  seem  to  be  acceptable  in  view  of  the 
better  than  sixfold  decrease  in  implementation  costs  (e.g.,  flops  required  in 
matrix  multiplication)  which  results  from  order  reduction. 
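
The "better than sixfold" figure can be checked with a one-line estimate. The sketch assumes, as a rough model not stated in the report, that the dominant on-line cost is a dense matrix-vector product whose operation count grows with the square of the compensator order; the function name matvec_flops is hypothetical.

```python
def matvec_flops(order):
    """Rough multiply-add count for one order-by-order matrix-vector product."""
    return order * order

ratio = matvec_flops(16) / matvec_flops(6)
print(round(ratio, 2))   # 7.11
```

Under this assumption, going from the 16-state to the 6th order compensator reduces the per-step arithmetic by a factor of about seven.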

In  conclusion,  these  results,  together  with  much  additional  material 
included in [27], demonstrate automated solution of the full OP/ME design
equations  (shown  in  Figure  4.1-4)  and  illustrate  the  performance  and 
implementation  benefits  to  be  expected  under  this  unified  approach. 
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Figure  4.2-5.  Cost  Versus  Complexity 
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Abstract 

Several suboptimal approaches to reduced-order controller design are
compared with the new optimal projection formulation of the
quadratically optimal fixed-form compensator problem. The substantial
similarities and significant differences among the various design
techniques are highlighted by placing the design equations of all
methods within a common notation. Basically, all methods characterize
the reduced-order controller by a projection on the full state space.
The suboptimal methods construct this projection on the basis of
balancing considerations while the optimal projection equations define
it as a consequence of optimality conditions. Issues relating to
relative computational simplicity and design reliability are explored
by applying two of the methods to the same example problem.

1.  Introduction

The design of reduced-order dynamic controllers for high-order systems
is of considerable importance for applications involving large
spacecraft and flexible flight systems, and extensive research has
recently been devoted to this area. This paper reviews and compares,
both theoretically and computationally, several current approaches to
reduced-order controller design.

One procedure for addressing this problem is to first apply some
suitable model reduction algorithm to reduce the plant model to the
dimension desired for the controller and then obtain an LQG controller
which is optimal for the reduced model.

A second, and perhaps more satisfactory, approach is to predicate the
control design upon a higher order model and then to reduce the
dimension of the controller. Of course, this technique presupposes
that some form of model reduction is still employed to reduce the
originally very high order plant model to a "Riccati-solvable"
dimension.

Because they all reflect the latter "controller reduction" philosophy
and exhibit significant similarities, we confine attention here to the
following methods:

1)  Balanced Controller Reduction Algorithm (BCRA)

--This is the internal balancing model reduction approach of Moore [1]
applied to the controller reduction problem.

2)  Balanced Controller Reduction Algorithm - Modified (BCRAM)

--A modification of BCRA by Yousuff and Skelton [2].

3)  Component Cost Algorithm (CCA)

--An application of Component Cost Analysis to this problem by Yousuff
and Skelton [3].

4)  Optimal Projection Conditions (OPC)

--These design equations are actually the necessary stationarity
conditions for quadratically optimal, fixed-order compensator design
in the form originally derived in Reference 4. Subsequently, the
derivation was significantly improved and strengthened by Hyland and
Bernstein [5].

5)  Approximate Optimal Projection Conditions - nth iterate (AOPCn)

--This denotes the iterative algorithm, terminated at the nth iterate,
for solution of the OPC as described in Reference 6. Under benign
conditions, this algorithm is locally convergent so that AOPCn becomes
equivalent to OPC as $n \to \infty$.

A basic distinction among the above methods should be noted: (1) - (3)
are admittedly suboptimal approaches based, essentially, upon
balancing considerations while formulation (4) and its computational
implementation (5) arise from consideration of quadratically optimal,
fixed-order compensator design.

Of course, the stationary conditions for fixed-form compensation have
been written down (see [7-12], for example). However, full
exploitation of the stationary conditions has no doubt been retarded
by their extreme complexity. What is lacking, to quote the insightful
remarks of [11], "is a deeper understanding of the structural
coherence of these equations." The contribution of References [4] and
[5] was to show how the originally very complex stationary conditions
can be reduced, without loss of generality, to much simpler and more
tractable forms. The resulting equations preserve the simple form of
LQG relations for the gains in terms of covariance and cost matrices,
which, in turn, are determined by a coupled system of two modified
Riccati equations and two modified Lyapunov equations. This coupling,
by means of a projection (idempotent matrix) whose rank is precisely
equal to the order of the compensator, represents a graphic portrayal
of the demise of the classical separation principle.

The compensator form which naturally emerges from this formulation is
fully defined by the gains and by the projection matrix, whose row and
column spaces are, respectively, the observation and control subspaces
of the compensator. In fact, the stationary conditions are of such a
form that they determine this "optimal projection" together with the
gains.

It must be emphasized that the emergence of such a compensator
projection does not represent an a priori assumption regarding the
controller structure but rather is a consequence of the first-order
necessary conditions.

The highly structured character of the optimal projection conditions
not only gives rise to direct numerical solution procedures (as has
been illustrated in [6]) but also sheds light on the various
suboptimal techniques. One aim of this paper is to elucidate the
fundamental connections existing between the fixed-order compensator
optimality conditions and the balancing approaches of methods 1-3.

Note that since method 4 above constitutes the optimality conditions
for fixed-order compensation, it theoretically represents the "best"
controller-order reduction scheme, i.e. it gives the minimum (zero)
degree of suboptimality and can thus serve as a standard of
comparison. On the other hand, solution of OPC via the iterative
algorithm of method 5 entails more computational effort than methods
1-3. Thus, the second major goal of this paper is to examine the
tradeoffs between the greater computational simplicity of methods such
as 1-3 versus the possibilities of improved performance and design
reliability offered by method 5.

The plan of the paper is as follows. After presenting the general
problem formulation, we establish a common notation and display the
basic equations of all methods side by side (see Table 1 below). This
permits the various design approaches to be compared quite directly
and introduces considerable efficiency in the discussion. After
addressing several important theoretical issues, we finally apply a
selection of the methods to a single numerical example which involves
a 20 state reduced-order version (see Reference [13]) of the CSDL
ACOSS Model No. 2. The theoretical and numerical results allow several
conclusions to be drawn regarding the comparative efficiency and
suboptimality of the several design methods. In particular, it is
shown that with modest increase in computational effort the optimal
projection approach produces stable and optimal designs in cases
wherein some suboptimal methods fail to yield stable designs.


is either minimized (subject to the structural constraints implicit in
(2)) or, at least, rendered as small as is practicable. The
challenging aspect of the problem is that, in accordance with
practical implementation constraints associated with the limitations
of on-line software, Nc (the dimension of the compensator) is chosen
in advance to be some number which is less than the plant dimension.

Within the above problem formulation, it is now possible to distill
all methods considered here into a common notation. Although
reasonably obvious, the assertions made below beginning with equation
(4) and concluding with equation (16) and Table 1 are substantiated
for methods 1-3 in the Appendix. No additional confirmation is
required for the optimal projection equations since the following
ideas were explicitly stated for OPC in [4] and [5].

First, it can be shown that all design methods considered establish a
projection of rank Nc:

$$\tau \in \mathbb{R}^{N \times N}, \quad \tau^2 = \tau, \quad \operatorname{rank}(\tau) = N_c \qquad (4)$$

which characterizes the observation and control subspaces encompassed
by the compensator. Moreover, a factorization of $\tau$ is always
employed (at least implicitly) which has the form:

$$\tau = g^T \Gamma \qquad (5.a)$$

where $\Gamma$ and $g$ are full-rank, Nc x N matrices satisfying:

2.  Problem  Statement  -  Setting 
Up  a  Common  Notation 


The  problem  addressed  concerns  the  linear, 
finite-dimensional,  time-invariant  system: 


$$\dot{x} = A x + B u + w_1; \quad x \in \mathbb{R}^N$$
$$y = C x + w_2; \quad y \in \mathbb{R}^p \qquad (1)$$

where x is the plant state, A is the plant dynamics matrix and B and C
are control input and sensor output maps, respectively, $w_1$ is a
white disturbance noise with intensity matrix $V_1 \geq 0$ and $w_2$
is observation noise with nonsingular intensity $V_2 > 0$.


The  problem  addressed  by  all  the  methods  listed 
above  is  to  design  a  constant  gain  dynamic  compen¬ 
sator  of  the  form: 

$$u = -K q$$
$$\dot{q} = A_c q + F y; \quad q \in \mathbb{R}^{N_c} \qquad (2)$$


$$\Gamma g^T = g \Gamma^T = I_{N_c} \qquad (5.b)$$

Note that if $\tau$ is a projection, then $\Gamma$ and $g$
satisfying (5.b) always exist such that (5.a) holds.
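
The relations (4), (5.a) and (5.b) can be checked numerically. The sketch below uses NumPy and hypothetical factors for N = 3, Nc = 2, chosen only for illustration; the function name is likewise an assumption and not part of the paper.

```python
import numpy as np


def projection_from_factors(gamma: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Return tau = g^T Gamma for Nc x N factors satisfying Gamma g^T = I."""
    nc = gamma.shape[0]
    if not np.allclose(gamma @ g.T, np.eye(nc)):
        raise ValueError("factors must satisfy Gamma g^T = I (5.b)")
    return g.T @ gamma


# Hypothetical factors (N = 3, Nc = 2), for illustration only.
gamma = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
g = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])

tau = projection_from_factors(gamma, g)
print(tau)
print("idempotent:", np.allclose(tau @ tau, tau))
print("rank:", np.linalg.matrix_rank(tau))
```

Note that the resulting projection is oblique (not symmetric), which is permitted by (4).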


Secondly, it can be shown that methods 1-5 all produce matrices K, F,
Ac of the form:

$$K = \bar{K} g^T, \qquad F = \Gamma \bar{F}, \qquad A_c = \Gamma (A - \bar{F} C - B \bar{K}) g^T \qquad (6)$$

where

$$\bar{K} = R_2^{-1} B^T P, \qquad \bar{F} = Q C^T V_2^{-1} \qquad (7)$$

and where P and Q are both symmetric and positive semi-definite, N x N
matrices.
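
A minimal sketch of how (6) and (7) assemble a reduced-order compensator from given P, Q and projection factors is shown below. It uses NumPy; the function name, the argument names and the numerical values are illustrative assumptions. In particular, P and Q here are placeholder identity matrices, not solutions of any Riccati equation.

```python
import numpy as np


def reduced_compensator(A, B, C, R2, V2, P, Q, gamma, g):
    """Assemble (K, F, Ac) per (6) from the full-order gains of (7)."""
    K_full = np.linalg.solve(R2, B.T @ P)      # R2^{-1} B^T P
    F_full = Q @ C.T @ np.linalg.inv(V2)       # Q C^T V2^{-1}
    K = K_full @ g.T
    F = gamma @ F_full
    Ac = gamma @ (A - F_full @ C - B @ K_full) @ g.T
    return K, F, Ac


# Hypothetical 2-state plant; P and Q are placeholders for illustration.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
R2 = np.array([[1.0]])
V2 = np.array([[1.0]])
P = np.eye(2)
Q = np.eye(2)
gamma = np.array([[1.0, 0.0]])   # Nc = 1
g = np.array([[1.0, 0.0]])

K, F, Ac = reduced_compensator(A, B, C, R2, V2, P, Q, gamma, g)
print(K, F, Ac)
```

With gamma and g both set to the N x N identity, the same function returns the full-order form Ac = A - FC - BK, which is the sense in which the reduced-order compensator is a projected full-order one.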


Nc  <  N 


In  summary,  all  methods  yield  the  closed-loop 
system  equations: 


such  that  the  quadratic,  steady-state  performance 
index: 


$$\dot{x} = A x - B \bar{K} g^T q + w_1$$
$$\dot{q} = \Gamma (A - \bar{F} C - B \bar{K}) g^T q + \Gamma \bar{F} (C x + w_2) \qquad (8)$$


with convention (10.c).

This  notation  together  with: 


with the gains given by (7). In other words, the reduced-order
compensator takes the form of a full-order compensator projected down
to an Nc-dimensional subspace. The fact that (4) - (8) hold for the
various suboptimal design methods was recently recognized in [14].


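$$\Sigma \triangleq B R_2^{-1} B^T, \qquad \bar{\Sigma} \triangleq C^T V_2^{-1} C \qquad (14)$$

$$A_P \triangleq A - \Sigma P, \qquad A_Q \triangleq A - Q \bar{\Sigma}, \qquad A_C \triangleq A - \Sigma P - Q \bar{\Sigma}$$
$$p \triangleq P \Sigma P, \qquad q \triangleq Q \bar{\Sigma} Q \qquad (15)$$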


Thus, the principal distinction among the design methods rests in the
manner in which P, Q and $\tau$ (or equivalently $\Gamma$ and $g$)
are constructed. To elucidate this matter, we first introduce the
lemma (see [15]):

Lemma. Suppose $M_1 \in \mathbb{R}^{N \times N}$ and
$M_2 \in \mathbb{R}^{N \times N}$ are positive semi-definite. Then
the product $M_1 M_2$ is semi-simple (i.e. all Jordan blocks are of
order unity) with real, non-negative eigenvalues.

For  convenience,  let  us  also  set  up  some  addi¬ 
tional  notation  relating  to  semi-simple  matrices. 

If $M \in \mathbb{R}^{N \times N}$ is semi-simple, then, for some
nonsingular $\Phi$:

$$M = \Phi \Lambda \Phi^{-1} \qquad (9)$$


$$\tau_\perp \triangleq I_N - \tau \qquad (16)$$


allows us to state the basic design equations rather succinctly.
Table 1 lists the equations determining P, Q and $\tau$ for BCRA,
BCRAM, CCA, AOPC1 and OPC, where P, Q, $\hat{P}$ and $\hat{Q}$ are
required to be positive semi-definite. In view of the above Lemma,
$\hat{Q}\hat{P}$ and $\hat{Q}p$ are all semi-simple with real,
non-negative eigenvalues. Thus all the methods displayed construct
the projection $\tau$ as the sum of Nc (disjoint) eigenprojections
associated with the Nc largest eigenvalues of a semi-simple matrix.

Not  shown  in  Table  1  is  the  computational  algo¬ 
rithm,  AOPCn  for  solution  of  the  optimal  projection 
equations.  This  algorithm  proceeds  as  follows: 


where $\Lambda$ is the diagonal matrix of eigenvalues of M.

Now, letting $u_k$ denote the kth column of $\Phi$ and $v_k^T$ the
kth row of $\Phi^{-1}$, (9) may be expressed as:


$$M = \sum_{k=1}^{N} \lambda_k u_k v_k^T \qquad (10.a)$$

where the sets of vectors $\{u_k\}$, $\{v_k\}$ are mutually
biorthonormal, i.e.:

$$v_k^T u_j = \begin{cases} 1, & k = j \\ 0, & k \neq j \end{cases} \qquad (10.b)$$


and where we adopt the convention that the $\lambda_k$'s are arranged
in order of decreasing magnitude:

$$|\lambda_1| \geq |\lambda_2| \geq \cdots \geq |\lambda_{N-1}| \geq |\lambda_N| \qquad (10.c)$$


(10.a) is clearly analogous to the standard result for the spectral
decomposition of a normal matrix. For this reason, we may term the
quantity:

$$\pi_k[M] \triangleq u_k v_k^T \qquad (11)$$


the eigenprojection of M associated with the kth eigenvalue (under
convention (10.c)). In view of (10.b), the $\pi_k[M]$ form a set of
unit rank, mutually disjoint projections:

$$(\pi_k[M])^2 = \pi_k[M], \qquad \pi_k[M] \, \pi_j[M] = 0 \ \text{ if } k \neq j \qquad (12)$$


and M is written:

$$M = \sum_{k=1}^{N} \lambda_k \pi_k[M] \qquad (13)$$
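
The eigenprojection notation can be illustrated numerically. The sketch below, written with NumPy, forms the unit-rank eigenprojections of a product of two positive semi-definite matrices and checks the spectral sum and the disjointness property. The matrices and the function name are hypothetical, chosen so that the eigenvalues are distinct; this is an illustration, not the authors' code.

```python
import numpy as np


def eigenprojections(M):
    """Eigenvalues in decreasing magnitude and matching unit-rank projections."""
    lam, Phi = np.linalg.eig(M)          # columns of Phi are the u_k
    Phi_inv = np.linalg.inv(Phi)         # rows of Phi_inv are the v_k^T
    order = np.argsort(-np.abs(lam))
    projs = [np.outer(Phi[:, k], Phi_inv[k, :]) for k in order]
    return lam[order], projs


# Hypothetical positive definite factors, as in the Lemma.
M1 = np.array([[2.0, 1.0], [1.0, 2.0]])
M2 = np.array([[1.0, 0.0], [0.0, 3.0]])
M = M1 @ M2

lam, projs = eigenprojections(M)
reconstructed = sum(l * p for l, p in zip(lam, projs))
print("eigenvalues:", lam)
print("spectral sum matches M:", np.allclose(reconstructed, M))
print("disjoint:", np.allclose(projs[0] @ projs[1], 0.0))
```

For this example the eigenvalues are real and non-negative, as the Lemma asserts for any such product.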


AOPCn

1)  To start, set $\tau_0 = I_N$.

2)  Using the previous iterate, $\tau_{k-1}$, for $\tau$, solve the
OPC equations for P, Q, $\hat{P}$ and $\hat{Q}$.

3)  Determine the eigenvalues and eigenvectors of $\hat{Q}\hat{P}$
and form the eigenprojections $\pi_j[\hat{Q}\hat{P}]$; j = 1, ..., N.

(In general there will be N > Nc non-zero eigenvalues. If there are
exactly Nc non-zero eigenvalues at this point, then the OPC's are
satisfied identically.)

4)  Set $\tau_k$ equal to the sum of eigenprojections corresponding
to the Nc largest eigenvalues of $\hat{Q}\hat{P}$.

5)  Terminate if either (a) k = n or (b) the ratio of the (Nc + 1)th
to the Ncth eigenvalue of $\hat{Q}\hat{P}$ falls below some
preassigned convergence tolerance, $\epsilon \ll 1$ (in which case
the optimal projection conditions are satisfied to an acceptable
approximation). Otherwise, increment k and return to Step 2.
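
Steps 3 to 5 of the iteration can be sketched as follows. The code assumes that the two positive semi-definite matrices produced by Step 2 are already available (solving the coupled OPC equations is not shown), and the function and variable names are hypothetical. It uses NumPy only.

```python
import numpy as np


def update_projection(Q_hat, P_hat, nc):
    """Steps 3-4: projection from the nc largest eigenvalues of Q_hat @ P_hat.

    Also returns the eigenvalue ratio used in the Step 5 convergence test.
    """
    lam, Phi = np.linalg.eig(Q_hat @ P_hat)
    lam, Phi = lam.real, Phi.real        # by the Lemma the spectrum is real
    Phi_inv = np.linalg.inv(Phi)
    order = np.argsort(-lam)
    keep = order[:nc]
    tau = Phi[:, keep] @ Phi_inv[keep, :]   # sum of the kept eigenprojections
    ratio = lam[order[nc]] / lam[order[nc - 1]] if nc < len(lam) else 0.0
    return tau, ratio


# Hypothetical Step 2 output for N = 3, used only to exercise the function.
Q_hat = np.diag([4.0, 1.0, 0.01])
P_hat = np.diag([1.0, 2.0, 1.0])

tau, ratio = update_projection(Q_hat, P_hat, nc=2)
tolerance = 0.01                         # assumed convergence tolerance
print(tau)
print("ratio:", ratio, "converged:", ratio < tolerance)
```

In a full implementation this update would sit inside a loop that re-solves the OPC equations with the new projection until the ratio test or the iteration limit is met.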

In the following discussion, we shall constantly refer to Table 1
using the equation designations indicated, i.e. the first OPC equation
will be referred to as Equation (OPC-a), etc.
Table 1.  Design equations for BCRA, CCA, BCRAM (= AOPC1) and OPC.
For each method, rows (a) and (b) give the equations for P and Q,
rows (c) and (d) the equations for $\hat{Q}$ and $\hat{P}$, and row
(e) the construction of the projection $\tau$.


3.  Theoretical Comparisons


The above description of AOPCn together with Table 1 summarizes all
the design methods very succinctly. The similarities among the methods
are evident. First, as already noted, all methods construct the
compensator projection from the eigenprojections associated with the
Nc largest eigenvalues of a product of two non-negative definite
matrices. This provides additional motivation for the use of the term
"optimal projection" in connection with the formulation of [4,5]
since, there, the projection is determined via optimality, not
balancing, considerations.


Secondly, all methods compute the cost matrices Q and P as solutions
to Riccati equations or modified Riccati equations. Furthermore, BCRA,
BCRAM and OPC construct $\tau$ from the product $\hat{Q}\hat{P}$
where $\hat{Q}$ and $\hat{P}$ are either controllability and
observability Grammians for the compensator (as in the case of BCRA)
or are closely analogous quantities. The fact that CCA lacks a
Lyapunov equation for $\hat{P}$ entails less of a distinction than
might first be thought since the term p ($= P \Sigma P$) appearing in
(CCA-e) is essentially the nonhomogeneous, driving term in the
Lyapunov equations determining $\hat{P}$ in the other methods. Thus,
the eigenvalues of $\hat{Q}p$ (termed the "component costs" in [3])
assign a relative weighting to the eigenprojections in a manner
analogous to the other methods.


In connection with equations (a) and (b) of the various methods, it
was stated in [14] that the optimal projection design is simply the
projection of an LQG controller. Equations (7) and (8) show that this
would indeed be so if P and Q in (7) were determined as solutions to
the LQG Riccati equations (as in CCA, BCRA and BCRAM). However, in
contrast to the suboptimal approaches, the OPC equations for P and Q
are modified Riccati equations containing additional terms involving
the projection. Thus, except under very special circumstances, the
optimal fixed-order compensator is generally not a projection of an
LQG design.


indirect result of this feature is that only OPC fully accounts for
the fact that the loop is being closed by an Nc-order compensator. In
fact, as mentioned previously, OPC constitutes the first-order
necessary conditions for the optimization problem: the optimal
fixed-order controller design must entail satisfaction of the optimal
projection equations. Also, under mild geometric restrictions,
solution of the optimal projection equations guarantees closed-loop
stability. In view of these properties, OPC can serve as a theoretical
standard of comparison for all the other (suboptimal) methods.

Judging from the appearance of the design equations, BCRAM is the one
suboptimal method most similar to OPC. In fact, as indicated in
Table 1, BCRAM and AOPC1 are identical. Basically, the BCRAM equations
are the optimal projection equations with the coupling terms in
$\tau_\perp$ omitted, and numerical evidence presented in [2]
suggests that BCRAM gives improved performance over BCRA (from which
BCRAM was originally derived as a modification).

Note that the coupling terms $\tau_\perp^T p \, \tau_\perp$ and
$\tau_\perp q \, \tau_\perp^T$ in OPC necessitate an iterative
solution algorithm such as AOPCn while CCA, BCRA and BCRAM compute the
compensator projection in only one step. Nevertheless, the suboptimal
methods cannot subsequently improve $\tau$ if it happens to result in
an unsuitable design (with poor performance or instability). AOPCn, in
contrast, offers the mechanism for progressive refinement of the
design to achieve a controller which is as nearly optimal as desired.

A further issue of general importance is whether or not the various
methods can produce a minimal order optimal compensator. In other
words, there may exist a compensator of order M < N which yields the
same performance as a full-order (Nc = N) compensator. It is highly
desirable that the selected design method be capable of producing such
a design when it exists. It turns out that all methods considered here
meet this requirement. This is substantiated for BCRA, BCRAM in [1]
and [2] respectively.


Indeed, the one striking distinction is that in OPC, the equations
determining P, Q, $\hat{P}$ and $\hat{Q}$ involve the compensator
projection explicitly. In essence, the terms
$\tau_\perp^T p \, \tau_\perp$ and $\tau_\perp q \, \tau_\perp^T$
(which are lacking in the suboptimal approaches) serve to couple the
"model reduction" portion of the problem (equations (OPC-c) and
(OPC-d)) with the gain computation portion ((OPC-a) and (OPC-b)). An


The capability of OPC to yield minimal order compensators follows
generally from their status as optimality conditions. More
specifically, however, if the LQG compensator has unobservable or
uncontrollable poles, then the associated subspaces appear in the null
space of $\hat{Q}\hat{P}$. This connection between
unobservable/uncontrollable poles and the rank of $\hat{Q}\hat{P}$
was explored in [5]. Thus, if a minimal realization of order M exists,
then rank [$\hat{Q}\hat{P}$] = M and setting Nc = M in OPC gives the
desired reduced order compensator.

To illustrate the point, consider the example given in [2]. In our
notation, the defining matrices are:


The  second-order,  LQG  compensator  is: 


(and the property was, in fact, never claimed by the authors). In view
of this distinction, and because the LQG-balancing method has been
evaluated and compared extensively in [2] and [3], it is not given
detailed consideration in the present paper.


4.  Numerical  Comparison  Using 
A  Common  Example  Problem 


Finally, we explore issues relating to practical design efficiency by
applying two of the methods considered here to the same example
problem.

We considered pointing and shape control of the "Solar Optical
Telescope" spacecraft example discussed in [3]. The original 44 mode
model was reduced to 10 modes (8 elastic and 2 rigid-body modes) by a
Modal Cost Analysis in [13]. In the notation used here, the matrices
defining this 20-state problem are:


I  *  0.001 


y  (25) 


$$u = -[2, \; 0] \, q$$

It is clear from inspection of (20) that the minimal (unity order)
compensator is obtained by deleting $q_2$ to get:


V  *  diag  [wk] 
k*1 . 20 

B  *  O3I  ,  C  -  [  P.  0] 


(26) 


$$\dot{q} = -3q + y$$
$$u = -2q \qquad (21)$$

At the sa«
the optima
solution:


ca-'-ca 


y  (22) 


Vi 

V2 


10-  [°p 
10'15 


[0/ST] 


yield  the 

f  J> 

n  fi 

0 

o  i  1 

[o 

0 

10 

0,  [P.o]  a. 

1 

-*  L° 

0 

io-3  j 

•  (28) 

R2  «  pig 


(27) 


P  •  17/4  +  16/17 


or 

r  *  [1.  0],  g  *  [1,or]  (24) 


Using these results with (6) and (7) shows that K = 2, F = 1 and
Ac = -3. Thus, OPC yields the minimal order compensator, (21). When
one applies the iterative solution scheme, AOPCn, it is found that the
correct projection and the desired values of K, F and Ac are produced
on the first iteration. Further iterations beyond AOPC1 yield no
change in $\tau$, K, F and Ac. Incidentally, this also illustrates
how both AOPC1 and BCRAM yield the minimal order compensator.


where the modal frequencies, ($\omega_k$, k = 1, ..., 10)
and  matrices  p  and  ?  are  given  in  Table  1  and  2, 
respectively, of [3]. In (28.b), $\rho$ is a positive scalar used to
adjust the relative weighting of the state and control input penalties
of the performance index. Clearly, overall controller authority,
actuator mean-square force levels and compensator bandwidth are all
inversely proportional to $\rho$.

Here, we discuss numerical results for $\rho \in [0.01, 100.0]$ and
for Nc in the range from 20 to 4 for design methods CCA and AOPCn.
The approach adopted for design comparison is to plot "regulation
cost" ($E[x^T R_1 x]$) as a function of "control cost"
($E[u^T u]$) (obtained by varying $\rho$) for each value of Nc and
for each of the design methods.

Results  for  these  tradeoff  curves  are  shown  in 
Fig.  1.  The  very  bottom-most  curve  represents  the 
full-order  (20  states)  LQG  design.  Since  this  is 
the  best  obtainable  when  there  is  no  restriction 
on  compensator  order,  the  problem  is  to  obtain  a 
lower  order  design  whose  tradeoff  curve  is  as  close 
to  the  LQG  results  as  possible. 


At  this  point,  it  should  be  mentioned  that  the 
new  "LQG  -  balancing"  method  of  Verriest  [16,  17] 
and  Jonckheere  and  Silverman  [18]  will  not  always 
yield  a  minimal  order  compensator  when  it  exists 


The thin black lines in Figure 1 show the Nc = 10, 6, and 4 designs
obtained via Component Cost Analysis, where Nc denotes the compensator
dimension. These results were obtained in [3] using the

Fig. 1  Performance Tradeoff Curves For Component Cost Analysis and
Optimal Projection

Fig. 2  Optimal Projection Results for S.O.T. Example - Percent
Performance Increase Over LQG Design


full design algorithm described in Appendix A, [3], including the
refinements of steps III.a through III.e. Note that the 10th and 6th
order compensator designs are quite good, but when compensator order
is sufficiently low (Nc = 4) and controller bandwidth sufficiently
large ($\rho < 5.0$), the method fails to yield stable designs. This
difficulty is characteristic of suboptimal techniques, and in
fairness, it should be noted that other suboptimal design methods
(such as the LQG balanced design method proposed by Verriest [16, 17])
fail to give stable designs for compensator orders below 10.


Note from Table 1 that one iteration of AOPCn requires only slightly
more computation than CCA since CCA involves solution of only one
Lyapunov equation and lacks an analogous equation for $\hat{P}$.
Thus, it may be estimated that the OPC results given here required
roughly 4 to 8 times the computational effort of CCA. This increased
computational burden for OPC is offset, in the higher bandwidth cases,
by the reliable production of very low-order but excellent performance
designs.

5.  Conclusions 


In contrast, the width of the grey line in Figure 1 encompasses all
the optimal projection results for compensators of order 10, 6, and 4.
To provide a more detailed picture of the optimal projection results,
Figure 2 shows the percent of total performance increase relative to
the full-order, LQG designs (the quantity
$100 \times (J_s(N_c) - J_s(20)) / J_s(20)$) as a function of
$1/\rho$ (proportional to controller bandwidth and to actuator force
levels) for the various compensator orders considered.
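
The quantity plotted in Figure 2 is a simple relative difference. The helper below makes the arithmetic explicit; the function name and the sample cost values are hypothetical and are not data from the paper.

```python
def percent_increase(j_reduced: float, j_full: float) -> float:
    """Percent increase of a reduced-order cost over the full-order LQG cost."""
    return 100.0 * (j_reduced - j_full) / j_full


# Hypothetical cost values, for illustration only.
example = percent_increase(j_reduced=1.05, j_full=1.00)
print(f"{example:.1f} percent above the full-order cost")
```

A value of zero corresponds to a reduced-order design that matches the full-order LQG cost exactly.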

Even for the 4th order design, the optimal projection performance is
only ~5 percent higher than the optimal full-order design.
Furthermore, the performance index for the optimal projection designs
increases monotonically with decreasing controller order, as it
should. Such is not necessarily the case for suboptimal design
methods.

The OPC results were actually obtained via the iterative algorithm
AOPCn outlined in Section 2 (and described in more detail in [6]). In
all cases except Nc = 4, $\rho = 0.5$, adequate convergence in
performance was obtained in four iterations. For the case Nc = 4,
$\rho = 0.5$, four iterations produced an unstable closed-loop
system. This is apparently due to inadequate convergence in this case,
since four additional iterations sufficed to give the results
indicated in Figs. 1-2.


In the preceding, we have established a common theoretical framework
within which various "balancing" approaches (BCRA, BCRAM and CCA) to
controller order reduction may be compared directly with a new
formulation (OPC) of the quadratically optimal fixed-order
compensation problem. The optimal and suboptimal approaches were found
to be highly analogous, each technique characterizing the
reduced-order compensator by a projection of rank equal to the desired
compensator dimension. The major distinction among the methods is that
in OPC, the projection arises as a consequence of optimality
conditions, while in the suboptimal methods it is constructed on the
basis of other considerations. The ramifications of this distinction
were explored by comparing performance results obtained via CCA and
OPC for the same example problem.

The  above  comparisons  lead  to  the  following 
conclusions: 

1)  In all cases considered, the total performance index for OPC is
less than for CCA. Even in the case Nc = 4, the OPC regulation vs.
control cost plot is very close to that of the full-order (Nc = 20)
design.

2)  The performance index for OPC increases monotonically with
decreasing controller order. Such is not generally the case for
suboptimal design methods.

3)  Various suboptimal methods (including CCA) fail to produce stable
designs when the compensator order and control input weighting
parameter, $\rho$, are both sufficiently small. Contrasting results
with OPC solutions, we see that this effect is often an artifact of
the suboptimal character of the design methods and does not
necessarily imply non-existence of stabilizing compensator designs for
small Nc.

4)  The computational effort associated with the use of CCA, BCRA or
BCRAM is roughly equal to the effort of one iteration of the AOPCn
algorithm used for solution of the optimal projection conditions. On
the other hand, excellent results for OPC were obtained in 4 - 8
iterations. Thus, while it is significantly more reliable in producing
satisfactory designs, OPC does necessitate an increase in
computational burden.

The  above  remarks  would  seem  to  motivate  con¬ 
tinued  exploration  of  the  optimal  projection 
approach  and  additional  efforts  to  increase  its 
computational  efficiency.  It  is  also  possible  that 
the  optimal  projection  equations  may  provide  the 
insight  needed  to  devise  improved  suboptimal  (and 
non-iterative)  design  techniques. 

APPENDIX 

Here we verify the statements made in Section 2, from equation (4)
through (16) and Table 1, for the design methods BCRA, BCRAM and CCA.


Here, consider the internal balancing approach to model reduction as
applied to a full-order LQG compensator:

$$\dot{q} = A_c q + F y, \qquad u = -K q \qquad (A.1)$$

where Ac, K and F are given by (6) and (7) with $g = \Gamma = I_N$
and P and Q satisfying the LQG Riccati equations. Considering
$V_2^{-1/2} y$ as the system input and $R_2^{1/2} u$ as the output,
and assuming, in this discussion, that Ac is asymptotically stable,
the controllability and observability Grammians are determined as
unique positive semidefinite solutions to (see Ref. [1], p. 21 and
make allowance for notation and input, output definitions):

0  *■  Acwl  +  WcAI  +  FV,FT 

T  ]  ,  T  (A.2) 

0  *  A^Wq  +  W0AC  +  k'r,k 

Having obtained these quantities, it is clear
from the discussion of [1], p. 24, or from the
Lemma in the main text of this paper that there
exists a transformation with matrix $P$ such that $W_c$
and $W_o$ can both be reduced to the diagonal matrix
of second-order modes ($\Sigma^2$ in Moore's notation).
Referring to the expressions given for $W_c$ and $W_o$
under coordinate transformation given in [1],
p. 23, we have

\[ W_c = P^{-1} \Sigma^2 P^{-T}, \qquad W_o = P^T \Sigma^2 P. \tag{A.3} \]

Obviously, (A.3) indicates that $P^{-1}$ is the right
eigenvector matrix, and $\Sigma^4$ the eigenvalues, of $W_c W_o$.
Thus, employing the notation introduced in Section 2:

\[ W_c W_o = \sum_{k=1}^{N} \sigma_k^4 \, \Pi_k[W_c W_o] \tag{A.4} \]

where the $\Pi_k$ are formed directly from the columns
and rows of $P^{-1}$ and $P$, respectively. As recommended
in the Internal Dominance section of [1], one
forms a reduced-order model of (A.1) by deleting,
in the internally balanced basis, all states
associated with the $N - N_c$ smallest second-order modes.
Setting

\[ \Gamma = \text{first } N_c \text{ rows of } P, \qquad
   g^T = \text{first } N_c \text{ columns of } P^{-1}, \tag{A.5} \]


this  is  tantamount  to  defining  a  reduced  order  mod¬ 
el  of  (A.1)  via  (6)  and  (7).  Thus,  (6)  and  (7)  are 
verified  for  BCRA. 

Now $\Gamma$ and $g$ as obtained above are obviously
one factorization of the projection:

\[ \tau = g^T \Gamma = \sum_{k=1}^{N_c} \Pi_k[W_c W_o]. \tag{A.6} \]

However, any factorization of $\tau$ can be related to a
given one by a similarity transformation of the
reduced-order system. Since the second-order modes
and the compensator performance are invariant under
such transformations, the particular factorization
is immaterial. Thus the internal balancing
approach to controller reduction is mathematically
equivalent to defining a controller of the form (6)
and (7) by (1) computing $P$ and $Q$ via the LQG Riccati
equations (BCRA-a and BCRA-b of Table 1); (2) determining
$\hat{Q} = W_c$ and $\hat{P} = W_o$ from equations (A.2)
(setting $\hat{Q} = W_c$, $\hat{P} = W_o$, $F = Q C^T V_2^{-1}$,
$K = R_2^{-1} B^T P$ and $A_c = A - \Sigma P - Q \bar{\Sigma}$,
equations (A.2) become BCRA-c and BCRA-d); (3) forming the
compensator projection via (A.6) or, equivalently, (BCRA-e),
and then effecting any suitable factorization in
accordance with (5.a, b). This verifies (4)
through (16) and Table 1 for BCRA.
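The BCRA steps just verified can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes a small, asymptotically stable compensator, solves the two Lyapunov equations (A.2) by vectorization, and builds the projection from the dominant eigenprojections of $W_c W_o$. The function names `lyap` and `balancing_projection` and the numerical data are hypothetical. The eigenvector scaling returned by NumPy is not the balanced one, which is harmless since the particular factorization is immaterial.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by row-major vectorization (small N only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)

def balancing_projection(Ac, F, K, V2, R2, Nc):
    """Return (Gamma, gT, tau) for the Nc dominant eigenprojections of Wc Wo."""
    Wc = lyap(Ac, F @ V2 @ F.T)        # 0 = Ac Wc + Wc Ac^T + F V2 F^T
    Wo = lyap(Ac.T, K.T @ R2 @ K)      # 0 = Ac^T Wo + Wo Ac + K^T R2 K
    lam, X = np.linalg.eig(Wc @ Wo)    # columns of X play the role of P^{-1}
    order = np.argsort(-lam.real)      # largest eigenvalues first
    X = X[:, order].real
    P = np.linalg.inv(X)
    Gamma = P[:Nc, :]                  # first Nc rows of P
    gT = X[:, :Nc]                     # first Nc columns of P^{-1}
    return Gamma, gT, gT @ Gamma

# Hypothetical stable 3-state compensator, reduced to Nc = 2.
Ac = np.array([[-1.0, 0.5, 0.0], [0.0, -2.0, 1.0], [0.0, 0.0, -3.0]])
F = np.array([[1.0], [1.0], [1.0]])
K = np.array([[1.0, 1.0, 0.2]])
Gamma, gT, tau = balancing_projection(Ac, F, K, np.eye(1), np.eye(1), 2)
```

The returned `tau` is idempotent with rank Nc, and `Gamma @ gT` is the Nc x Nc identity, which is the factorization property required by (5.a, b).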


This design method is a modification of BCRA
proposed in [2] to circumvent difficulties in the
application of BCRA when $A_c$ is not asymptotically
stable. Referring to equations (2.7.a) and (2.7.b)
of [2], the only change from BCRA is that in (BCRA-c),
$A_c$ is replaced by $A - BK = A - \Sigma P$ and in
(BCRA-d), $A_c$ is replaced by $A - FC = A - Q\bar{\Sigma}$. Note
that since $P$ and $Q$ are LQG Riccati equation
solutions, $A - \Sigma P$ and $A - Q\bar{\Sigma}$ are asymptotically
stable so that the method has a wider field of
application.

In view of the preceding discussion of BCRA,
the simple changes noted suffice to substantiate
(4) through (16) and Table 1 for BCRAM.


For  simplicity,  we  consider  only  the  basic 
Component  Cost  Analysis  approach  (termed  the  Cost- 
Decoupled Controller Design Algorithm) given in [3],
omitting  the  special  procedures  described  after 
Theorem  1  of  Section  III,  Ref.  [3]  for  identifying 
"nearly"  unobservable  states  by  use  of  a  set  of 
cost  decoupled  coordinates  closely  associated  with  a 
generalized Hessenberg representation. This means
that  we  confine  attention  to  the  algorithm  in  Appen¬ 
dix  A  of  [3]  without  the  computational  refinements 
of  steps  III  and  IV. 

Referring now to Appendix A, Ref. [3], the
Cost-Decoupled design algorithm starts in steps I.a
and I.b by computing the full-order LQG compensator
design. Allowing for differences in notation, the
expressions (A.3b)-(A.3d) of Ref. [3] are identical
to (6) and (7) with $\Gamma = g = I_N$. Similarly,
(A.3e) and (A.3f) of Ref. [3] are our equations
(CCA-a) and (CCA-b) in Table 1.

Next consider step I.c. The quantities $X$, $BG$
and $FVF^T$ in [3] correspond to $\hat{Q}$, $-\Sigma P$ and $Q\bar{\Sigma}Q$ in
our notation. Thus, equation (A.4) of Ref. [3] is
equivalent to our (CCA-c) for determination of $\hat{Q}$.
Noting that $G^T R G$ of [3] corresponds to the quantity
$P \Sigma P$, steps I.d through II.a define a transformation
matrix $T_1$:

\[ T_1 = \Theta_x \Theta_u \tag{A.7} \]

where $\Theta_x$ is the square root of $\hat{Q}$:

\[ \hat{Q} = \Theta_x \Theta_x^T \tag{A.8} \]

and $\Theta_u$ is the orthonormal modal matrix of
$\Theta_x^T P \Sigma P \Theta_x$:

\[ \Theta_u^T \Theta_x^T P \Sigma P \, \Theta_x \Theta_u = \mathrm{diag}\{V_k\},
   \qquad V_1 \ge V_2 \ge \cdots \ge V_N \ge 0 \tag{A.9} \]

where the $V_k$'s are the nonnegative "component
costs".

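Some simple manipulation of (A.7)-(A.9)
shows that $T_1$ is defined such that:

\[ (\hat{Q} P \Sigma P)\, T_1 = T_1\, \mathrm{diag}\{V_k\}, \qquad
   T_1^{-1} \hat{Q}\, T_1^{-T} = I_N, \qquad
   T_1^T P \Sigma P \, T_1 = \mathrm{diag}\{V_k\}, \tag{A.10} \]

i.e., $T_1$ is a right eigenvector matrix of the product
$\hat{Q}(P \Sigma P)$ yielding a state transformation which
simultaneously diagonalizes the factors $\hat{Q}$ and $P \Sigma P$.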
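The construction (A.7)-(A.9) can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a Cholesky factor as one admissible "square root" satisfying (A.8), which requires $\hat{Q}$ to be positive definite, and a generic symmetric matrix `W` stands in for $P \Sigma P$. The function name and the test data are hypothetical.

```python
import numpy as np

def cost_decoupling_transform(Q_hat, W):
    """Return (T1, v) with T1 = Theta_x Theta_u; W stands in for P Sigma P."""
    theta_x = np.linalg.cholesky(Q_hat)       # Q_hat = theta_x theta_x^T
    v, theta_u = np.linalg.eigh(theta_x.T @ W @ theta_x)
    order = np.argsort(-v)                    # component costs, descending
    return theta_x @ theta_u[:, order], v[order]

# Hypothetical positive-definite Q_hat and W (both diagonally dominant).
Q_hat = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
W = np.array([[1.0, 0.2, 0.0], [0.2, 3.0, 0.1], [0.0, 0.1, 0.5]])
T1, v = cost_decoupling_transform(Q_hat, W)
T1_inv = np.linalg.inv(T1)
```

With these outputs, `T1_inv @ Q_hat @ T1_inv.T` is the identity and `T1.T @ W @ T1` is diagonal, so the columns of `T1` are right eigenvectors of `Q_hat @ W`.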

Step III basically effects a refinement of $T_1$
to isolate weakly observable components of the
state. For convenience in the exposition we omit
this here to obtain a statement of the "basic" CCA
algorithm. Thus, set $T = T_1$ in Step IV and proceed
to Step V. Equations (A.13e) and (A.13f) of [3]
read:

\[ [\,T_r, \; T_T\,] \triangleq T_1, \quad T_r \in \mathbb{R}^{N \times N_c}; \qquad
   \begin{bmatrix} L_r \\ L_T \end{bmatrix} \triangleq T_1^{-1}, \quad
   L_r \in \mathbb{R}^{N_c \times N}. \tag{A.11} \]


Clearly $T_r L_r$ defines a rank-$N_c$ projection on $\mathbb{R}^N$
which, in view of (A.10), is the sum of the
eigenprojections of $\hat{Q} P \Sigma P$ associated with the $N_c$ largest
eigenvalues. This verifies (CCA-e) of Table 1.
Also, comparing $\tau = T_r L_r$ with (5.a), it is seen
that $T_r$ and $L_r$ correspond to $g^T$ and $\Gamma$, respectively.
Thus, allowing for notational changes, equations
(A.13b)-(A.13d) of [3] read:

\[ A_c = \Gamma (A - Q\bar{\Sigma} - \Sigma P)\, g^T, \qquad
   F = \Gamma Q C^T V_2^{-1}, \qquad
   K = R_2^{-1} B^T P \, g^T, \tag{A.12} \]

where $P$ and $Q$ are solutions to (CCA-a) and (CCA-b).

(A.12) is obviously equivalent to (6) and (7).
The Optimal Projection Equations for Fixed-Order
Dynamic Compensation

DAVID  C.  HYLAND  AND  DENNIS  S.  BERNSTEIN 

Abstract: First-order necessary conditions for quadratically optimal,
steady-state, fixed-order dynamic compensation of a linear, time-invariant
plant in the presence of disturbance and observation noise are derived
in a new and highly simplified form. In contrast to the pair of matrix
Riccati equations for the full-order LQG case, the optimal steady-state
fixed-order dynamic compensator is characterized by four matrix equations
(two modified Riccati equations and two modified Lyapunov
equations) coupled by a projection whose rank is precisely equal to the
order of the compensator and which determines the optimal compensator
gains. The coupling represents a graphic portrayal of the demise of the
classical separation principle for the reduced-order controller case.

I.  INTRODUCTION 

Because of constraints imposed by on-line computations, dynamic
controllers for high-order systems such as flexible spacecraft must be of
relatively modest order. Hence, this paper is concerned with the design of
quadratically optimal, fixed-order (i.e., reduced-order) dynamic
compensation for a plant subject to stochastic disturbances and nonsingular
measurement noise. Since white noise in all measurement channels
precludes direct output feedback (see Section II), only purely dynamic
controllers are considered. The requirements for resolution of this
optimization problem include the following.

1) Conditions for the existence of an optimal, stabilizing compensator
of the prescribed order. (In the full-order case these are the usual
stabilizability and detectability conditions of LQG theory.)

2)  Stationary  conditions,  i.e.,  first-order  necessary  conditions,  ren¬ 
dered  in  a  tractable  form  to  facilitate  developments  in  items  3)  and  4) 
below. (In the full-order case these conditions are precisely the LQG gain
relations  together  with  the  regulator  and  observer  Riccati  equations.) 

3)  Sufficiency  conditions,  i.e.,  additional  restrictions  on  solutions  of 
the  first-order  necessary  conditions  which  characterize  local  minima  and 
single  out  the  global  minimum.  (In  the  full-order  case  the  global 
minimum is distinguished by the unique nonnegative-definite solutions to
the  LQG  Riccati  equations.) 

4)  Convergent  numerical  algorithms  for  simultaneous  satisfaction  of 
the  necessary  and  sufficient  conditions.  (In  the  full-order  case  numerical 
algorithms  have  been  devised  which  take  full  advantage  of  the  highly 
structured  form  of  the  Riccati  equations.) 
The present paper deals exclusively with item 2). Although the
stationary conditions for the fixed-order compensation problem have been
written down (see [1]-[12], for example), full exploitation has undoubtedly
been impeded by their extreme complexity [see (3.3)-(3.11)]. What
has been lacking, to quote the insightful remarks of [9], "is a deeper
understanding of the structural coherence of these equations." The
contribution of the present paper is to show how the originally very
complex stationary conditions can be transformed without loss of
generality to much simpler and more tractable forms. The resulting
equations (2.10)-(2.17) preserve the simple form of LQG relations for the
gains in terms of covariance and cost matrices which, in turn, are
determined by a coupled system of two modified Riccati equations and
two modified Lyapunov equations. This coupling, by means of a
projection (idempotent matrix) whose rank is precisely equal to the order
of the compensator, represents a graphic portrayal of the demise of the
classical separation principle for the reduced-order controller case. When,
as a special case, the order of the compensator is required to be equal to
the order of the plant, the modified Riccati equations reduce to the
standard LQG Riccati equations and the modified Lyapunov equations
express the proviso that the compensator be minimal, i.e., controllable
and observable. Since the LQG Riccati equations as such are nothing
more than the necessary conditions for full-order compensation, we
believe that the "optimal projection equations" provide a clear and simple
generalization of standard LQG theory.

Since we are concerned with optimal fixed-order compensator design,
our approach does not represent yet another model- or controller-reduction
scheme along the lines of [13]-[17]. Indeed, the optimal
projection equations, by virtue of their relatively transparent structure,
can reveal the extent to which the design equations of a given ad hoc
reduction scheme conform to the necessary conditions for optimality. For
example, the oblique projection which arises in the present formulation
may not be of the form $\begin{bmatrix} I_{n_c} & 0 \\ 0 & 0 \end{bmatrix}$ even in the basis corresponding to the
"balanced" realization [13]-[16]. These issues are discussed in [18]
where the results of [19] are simplified by means of the approach of the
present paper and where the balancing method of [13] is reinterpreted in
the context of optimality theory.

The  fact  that  the  optimal  projection  equations  consist  of  four  coupled 
matrix  equations,  i.e.,  two  modified  Riccati  equations  and  two  modified 
Lyapunov  equations,  should  not  be  at  all  surprising  for  the  following 
simple  reason.  Reduced-order  control-design  methods  often  involve  either 
LQG  applied  to  a  reduced-order  model  or  model  reduction  applied  to  a 
full-order  LQG  design.  Both  approaches,  then,  involve  the  solution  of 
precisely  four  equations:  two  Riccati  equations  (for  LQG)  plus  two 
Lyapunov  equations  (for  model  reduction  via  balancing,  as  in  [13]).  The 
coupled  form  of  the  optimal  projection  equations  is  thus  a  strong 
reminder  that  the  LQG  and  order-reduction  operations  cannot  be  iterated 
but  must,  in  a  certain  sense,  be  performed  simultaneously. 


II.  PROBLEM  STATEMENT  AND  THE  MAIN  THEOREM 


Given the control system

\[ \dot{x}(t) = A x(t) + B u(t) + w_1(t), \tag{2.1} \]
\[ y(t) = C x(t) + w_2(t), \tag{2.2} \]

design a fixed-order dynamic compensator

\[ \dot{x}_c(t) = A_c x_c(t) + B_c y(t), \tag{2.3} \]
\[ u(t) = C_c x_c(t) \tag{2.4} \]

which minimizes the steady-state performance criterion

\[ J(A_c, B_c, C_c) \triangleq \lim_{t \to \infty} E[\, x(t)^T R_1 x(t) + u(t)^T R_2 u(t) \,] \tag{2.5} \]

where: $x \in \mathbb{R}^n$, $y \in \mathbb{R}^l$, $x_c \in \mathbb{R}^{n_c}$, $n_c \le n$; $A$, $B$, $C$, $A_c$, $B_c$,
$C_c$, $R_1$, and $R_2$ are matrices of appropriate dimension with $R_1$ (symmetric)
nonnegative definite and $R_2$ (symmetric) positive definite; $w_1$ is white
disturbance noise with $n \times n$ nonnegative-definite intensity $V_1$ and $w_2$ is
white observation noise with $l \times l$ positive-definite intensity $V_2$; $w_1$ and
$w_2$ are uncorrelated and have zero mean. We note that the assumptions of
nonsingular control weighting and nonsingular observation noise preclude
the use of direct output feedback as in

\[ u(t) = C_c x_c(t) + D_c y(t) \tag{2.6} \]

since $J$ is undefined unless (see [7])

\[ \mathrm{tr}[D_c^T R_2 D_c V_2] = 0 \quad (\Leftrightarrow R_2 D_c V_2 = 0). \tag{2.7} \]

To guarantee that $J$ is finite and independent of initial conditions we
restrict our attention to the set of admissible stabilizing compensators

\[ \mathcal{A} \triangleq \left\{ (A_c, B_c, C_c) : \tilde{A} \triangleq
   \begin{bmatrix} A & B C_c \\ B_c C & A_c \end{bmatrix}
   \text{ is asymptotically stable} \right\} \]

where $\tilde{A}$ is the closed-loop dynamics matrix. Since the value of $J$ is
independent of the internal realization of the compensator, we can further
restrict our attention to

\[ \mathcal{A}_+ \triangleq \{ (A_c, B_c, C_c) \in \mathcal{A} :
   (A_c, B_c) \text{ is controllable and } (C_c, A_c) \text{ is observable} \}. \]

For the following lemma call a square matrix nonnegative (respectively,
positive) semisimple if it has a diagonal Jordan form and nonnegative
(respectively, positive) eigenvalues. Let $I_r$ denote the $r \times r$ identity
matrix.

Lemma 2.1: Suppose $\hat{Q}, \hat{P} \in \mathbb{R}^{n \times n}$ are nonnegative definite. Then
$\hat{Q}\hat{P}$ is nonnegative semisimple. Furthermore, if rank $\hat{Q}\hat{P} = n_c$ then there
exist $G, \Gamma \in \mathbb{R}^{n_c \times n}$ and positive-semisimple $M \in \mathbb{R}^{n_c \times n_c}$ such that

\[ \hat{Q}\hat{P} = G^T M \Gamma, \tag{2.8} \]
\[ \Gamma G^T = I_{n_c}. \tag{2.9} \]

Proof. The result is an immediate consequence of [20, Theorem
6.2.5, p. 123]. □

For convenience in stating the Main Theorem, define

\[ \Sigma \triangleq B R_2^{-1} B^T, \qquad \bar{\Sigma} \triangleq C^T V_2^{-1} C \]

and call $G$, $M$, and $\Gamma$ satisfying (2.8) and (2.9) a $(G, M, \Gamma)$-factorization
of $\hat{Q}\hat{P}$.

Main Theorem: Suppose $(A_c, B_c, C_c) \in \mathcal{A}_+$ solves the steady-state
fixed-order dynamic-compensation problem. Then there exist $n \times n$
nonnegative-definite matrices $Q$, $P$, $\hat{Q}$, and $\hat{P}$ such that $A_c$, $B_c$, and $C_c$ are
given by

\[ A_c = \Gamma (A - Q\bar{\Sigma} - \Sigma P) G^T, \tag{2.10} \]
\[ B_c = \Gamma Q C^T V_2^{-1}, \tag{2.11} \]
\[ C_c = -R_2^{-1} B^T P G^T \tag{2.12} \]

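for some $(G, M, \Gamma)$-factorization of $\hat{Q}\hat{P}$, and such that with $\tau \triangleq G^T \Gamma$ the
following conditions are satisfied:

\[ 0 = (A - \tau Q \bar{\Sigma}) Q + Q (A - \tau Q \bar{\Sigma})^T + V_1 + \tau Q \bar{\Sigma} Q \tau^T, \tag{2.13} \]
\[ 0 = (A - \Sigma P \tau)^T P + P (A - \Sigma P \tau) + R_1 + \tau^T P \Sigma P \tau, \tag{2.14} \]
\[ 0 = \tau [ (A - \Sigma P) \hat{Q} + \hat{Q} (A - \Sigma P)^T + Q \bar{\Sigma} Q ], \tag{2.15} \]
\[ 0 = [ (A - Q \bar{\Sigma})^T \hat{P} + \hat{P} (A - Q \bar{\Sigma}) + P \Sigma P ] \tau, \tag{2.16} \]
\[ \mathrm{rank}\, \hat{Q} = \mathrm{rank}\, \hat{P} = \mathrm{rank}\, \hat{Q}\hat{P} = n_c. \tag{2.17} \]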

Remark 2.1: Because of (2.9) the $n \times n$ matrix $\tau$ which couples the
four equations (2.13)-(2.16) is idempotent, i.e., $\tau^2 = \tau$. In general this
"optimal projection" is an oblique projection (as opposed to an
orthogonal projection) since it is not necessarily symmetric. Note that
Sylvester's inequality and (2.9) imply that rank $\tau = n_c$.

Remark 2.2: Using the relations $\hat{Q} = \tau \hat{Q}$ and $\hat{P} = \hat{P} \tau$ [see (3.12)],
the optimal projection equations (2.13)-(2.16) can be written in the
equivalent form

\[ 0 = A Q + Q A^T + V_1 - Q \bar{\Sigma} Q + \tau_\perp Q \bar{\Sigma} Q \tau_\perp^T, \tag{2.18} \]
\[ 0 = A^T P + P A + R_1 - P \Sigma P + \tau_\perp^T P \Sigma P \tau_\perp, \tag{2.19} \]
\[ 0 = (A - \Sigma P) \hat{Q} + \hat{Q} (A - \Sigma P)^T + Q \bar{\Sigma} Q - \tau_\perp Q \bar{\Sigma} Q \tau_\perp^T, \tag{2.20} \]
\[ 0 = (A - Q \bar{\Sigma})^T \hat{P} + \hat{P} (A - Q \bar{\Sigma}) + P \Sigma P - \tau_\perp^T P \Sigma P \tau_\perp \tag{2.21} \]

where $\tau_\perp \triangleq I_n - \tau$. Note that in the full-order case $n_c = n$, $\tau = G = \Gamma
= I_n$, and thus (2.18) and (2.19) reduce to the standard observer and
regulator Riccati equations and (2.10)-(2.12) yield the usual LQG
expressions. Furthermore, it can be shown that (2.20), (2.21), and (2.17)
are equivalent to the assumption that $(A_c, B_c, C_c)$ is controllable and
observable.
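The equivalence of the modified Riccati equation (2.13) with its rewritten form (2.18) is a purely algebraic identity in $Q$ and the projection, and it can be spot-checked numerically. The sketch below uses randomly generated data, assumes $V_2 = I$ and $V_1 = I$ for brevity, and builds an arbitrary oblique projection from a pair satisfying (2.9); all variable and function names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, nc = 4, 2
A = rng.standard_normal((n, n))
C = rng.standard_normal((2, n))
Sigma_bar = C.T @ C                      # C^T V2^{-1} C with V2 = I (assumed)
V1 = np.eye(n)                           # assumed disturbance intensity
L = rng.standard_normal((n, n))
Q = L @ L.T                              # arbitrary symmetric test matrix

# Random oblique projection tau = G^T Gamma with Gamma G^T = I.
Gamma = rng.standard_normal((nc, n))
Gt = np.linalg.pinv(Gamma)
Gt = Gt + (np.eye(n) - Gt @ Gamma) @ rng.standard_normal((n, nc))
tau = Gt @ Gamma
tau_perp = np.eye(n) - tau

def residual_2_13(Q):
    Acl = A - tau @ Q @ Sigma_bar
    return Acl @ Q + Q @ Acl.T + V1 + tau @ Q @ Sigma_bar @ Q @ tau.T

def residual_2_18(Q):
    QSQ = Q @ Sigma_bar @ Q
    return A @ Q + Q @ A.T + V1 - QSQ + tau_perp @ QSQ @ tau_perp.T
```

The two residuals agree for any symmetric `Q`, so a `Q` that zeroes one also zeroes the other. Setting `tau` to the identity makes `tau_perp` vanish and leaves the standard observer Riccati residual.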

Remark 2.3: Since $\hat{Q}\hat{P}$ is nonnegative semisimple it has a group
generalized inverse $(\hat{Q}\hat{P})^{\#}$ given by $G^T M^{-1} \Gamma$ (see e.g., [21, p. 124]).
Hence, by (2.9) the optimal projection $\tau$ is given by

\[ \tau = \hat{Q}\hat{P} (\hat{Q}\hat{P})^{\#}. \tag{2.22} \]
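One way to produce a $(G, M, \Gamma)$-factorization numerically, and to confirm (2.22), is sketched below. This particular construction is an assumption made for illustration and is not taken from the paper: it factors $\hat{Q} = L L^T$ with $L$ of full column rank $n_c$ and sets $G^T = L$, $M = L^T \hat{P} L$, $\Gamma = M^{-1} L^T \hat{P}$. The function name and the random rank-2 test matrices are hypothetical.

```python
import numpy as np

def factorize(Q_hat, P_hat, nc):
    """Return (G, M, Gamma) with Q_hat P_hat = G^T M Gamma and Gamma G^T = I."""
    w, U = np.linalg.eigh(Q_hat)              # eigenvalues in ascending order
    L = U[:, -nc:] * np.sqrt(w[-nc:])         # Q_hat = L L^T when rank Q_hat = nc
    M = L.T @ P_hat @ L                       # symmetric positive definite here
    Gamma = np.linalg.solve(M, L.T @ P_hat)
    return L.T, M, Gamma

rng = np.random.default_rng(2)
n, nc = 4, 2
U0 = rng.standard_normal((n, nc))
W0 = rng.standard_normal((n, nc))
Q_hat, P_hat = U0 @ U0.T, W0 @ W0.T           # rank-nc pseudogramian stand-ins
G, M, Gamma = factorize(Q_hat, P_hat, nc)
QP = Q_hat @ P_hat
QP_group = G.T @ np.linalg.inv(M) @ Gamma     # group inverse per Remark 2.3
tau = G.T @ Gamma
```

Here `tau` coincides with `QP @ QP_group`, and `QP_group` satisfies the defining properties of the group inverse (it commutes with `QP` and is a reflexive inverse of it).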

Remark 2.4: The modified Riccati equations (2.13) and (2.14) are
similar to the (single) "extended algebraic Riccati equation" which arises
in the static output feedback problem (see, e.g., [22]).

Remark 2.5: Replacing $x_c$ by $S x_c$, where $S$ is invertible, yields the
"equivalent" compensator $(S A_c S^{-1}, S B_c, C_c S^{-1})$. Since $J(A_c, B_c, C_c)
= J(S A_c S^{-1}, S B_c, C_c S^{-1})$ one would expect the Main Theorem to apply
also to $(S A_c S^{-1}, S B_c, C_c S^{-1})$. This is indeed the case since
transformation of the compensator state basis corresponds to the alternative
factorization $\hat{Q}\hat{P} = (S^{-T} G)^T (S M S^{-1}) (S \Gamma)$. See [10] for related
remarks.
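The invariance of the cost under a change of compensator basis can be illustrated numerically. The sketch below evaluates the steady-state cost as the trace of the closed-loop state covariance times the augmented weighting matrix, which is the standard expression for a quadratic criterion of a stable linear system driven by white noise. The plant, the compensator gains, and the matrix `S` are hypothetical values chosen only so that the closed loop is stable.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by row-major vectorization (small n only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)

def steady_state_cost(A, B, C, Ac, Bc, Cc, R1, R2, V1, V2):
    """J = tr(Q~ R~), with Q~ the closed-loop steady-state covariance."""
    n, nc = A.shape[0], Ac.shape[0]
    Z = np.zeros((n, nc))
    At = np.block([[A, B @ Cc], [Bc @ C, Ac]])
    Vt = np.block([[V1, Z], [Z.T, Bc @ V2 @ Bc.T]])
    Rt = np.block([[R1, Z], [Z.T, Cc.T @ R2 @ Cc]])
    assert np.all(np.linalg.eigvals(At).real < 0), "closed loop must be stable"
    return np.trace(lyap(At, Vt) @ Rt)

# Hypothetical plant and (not necessarily optimal) stabilizing compensator.
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Ac = np.array([[-3.0, 1.0], [0.0, -4.0]])
Bc = np.array([[0.5], [0.2]])
Cc = np.array([[-0.4, 0.1]])
R1, R2, V1, V2 = np.eye(2), np.eye(1), np.eye(2), np.eye(1)

S = np.array([[1.0, 2.0], [0.0, 1.0]])        # arbitrary invertible basis change
S_inv = np.linalg.inv(S)
J1 = steady_state_cost(A, B, C, Ac, Bc, Cc, R1, R2, V1, V2)
J2 = steady_state_cost(A, B, C, S @ Ac @ S_inv, S @ Bc, Cc @ S_inv, R1, R2, V1, V2)
```

The two costs `J1` and `J2` agree to numerical precision, as the remark asserts.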

Remark 2.6: By introducing the quasi-full-state estimate $\hat{x} \triangleq G^T x_c \in
\mathbb{R}^n$ so that $\tau \hat{x} = \hat{x}$ and $x_c = \Gamma \hat{x} \in \mathbb{R}^{n_c}$, (2.1)-(2.4) can be written as

\[ \dot{x} = A x + B \hat{C}_c \tau \hat{x} + w_1, \]
\[ \dot{\hat{x}} = \tau (A - \hat{B}_c C + B \hat{C}_c) \tau \hat{x} + \tau \hat{B}_c (C x + w_2) \]

where $\hat{B}_c \triangleq Q C^T V_2^{-1}$ and $\hat{C}_c \triangleq -R_2^{-1} B^T P$. Although the implemented
compensator has the state $x_c \in \mathbb{R}^{n_c}$, it can be viewed as a quasi-full-order
compensator whose geometric structure is entirely dictated by the
projection $\tau$. Sensor inputs are annihilated unless they are contained
in $[\mathcal{N}(\tau)]^\perp = \mathcal{R}(\tau^T)$, where $\mathcal{N}$ and $\mathcal{R}$ denote null space and range.
Furthermore, the quasi-full-order state estimate $\tau \hat{x}$ employed in the
control input is contained in $\mathcal{R}(\tau)$. Thus, $\mathcal{R}(\tau)$ and $\mathcal{R}(\tau^T)$ are the control
and observation subspaces of the compensator.
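The quasi-full-order form in Remark 2.6 follows from the gain expressions (2.10)-(2.12) and the relation (2.9) alone, so it can be confirmed with arbitrary data. The sketch below is an illustrative check with random matrices standing in for $Q$, $P$, and the factors; none of the numerical values come from the paper, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, nc = 4, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
R2_inv = np.array([[0.5]])                 # assumed R2 = 2
V2_inv = np.array([[2.0]])                 # assumed V2 = 0.5
Lq, Lp = rng.standard_normal((n, n)), rng.standard_normal((n, n))
Q, P = Lq @ Lq.T, Lp @ Lp.T                # stand-ins for Q and P

Gamma = rng.standard_normal((nc, n))
Gt = np.linalg.pinv(Gamma)                 # G^T with Gamma G^T = I
tau = Gt @ Gamma

Sigma = B @ R2_inv @ B.T
Sigma_bar = C.T @ V2_inv @ C
Ac = Gamma @ (A - Q @ Sigma_bar - Sigma @ P) @ Gt     # (2.10)
Bc = Gamma @ Q @ C.T @ V2_inv                         # (2.11)
Cc = -R2_inv @ B.T @ P @ Gt                           # (2.12)

Bc_hat = Q @ C.T @ V2_inv
Cc_hat = -R2_inv @ B.T @ P
quasi_dynamics = tau @ (A - Bc_hat @ C + B @ Cc_hat) @ tau
```

Lifting the implemented compensator with `Gt` reproduces the quasi-full-order matrices: `Gt @ Ac @ Gamma` equals `quasi_dynamics`, `Gt @ Bc` equals `tau @ Bc_hat`, and `Cc @ Gamma` equals `Cc_hat @ tau`.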

III. PROOF OF THE MAIN THEOREM

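The proof given here considerably simplifies the original derivation
given in [23] and [24]. Using the fact that $\mathcal{A}_+$ is open, the Fritz John
version of the Lagrange multiplier theorem can be used to rigorously
derive the first-order necessary conditions ([7], see also [25])

\[ 0 = \tilde{A} \tilde{Q} + \tilde{Q} \tilde{A}^T + \tilde{V}, \tag{3.1} \]
\[ 0 = \tilde{A}^T \tilde{P} + \tilde{P} \tilde{A} + \tilde{R}, \tag{3.2} \]
\[ 0 = P_{12}^T Q_{12} + P_2 Q_2, \tag{3.3} \]
\[ B_c = -(P_2^{-1} P_{12}^T Q_1 + Q_{12}^T) C^T V_2^{-1}, \tag{3.4} \]
\[ C_c = -R_2^{-1} B^T (P_1 Q_{12} Q_2^{-1} + P_{12}), \tag{3.5} \]

where

\[ \tilde{V} \triangleq \begin{bmatrix} V_1 & 0 \\ 0 & B_c V_2 B_c^T \end{bmatrix}, \qquad
   \tilde{R} \triangleq \begin{bmatrix} R_1 & 0 \\ 0 & C_c^T R_2 C_c \end{bmatrix} \]

and the $(n + n_c) \times (n + n_c)$ matrices $\tilde{Q}$, $\tilde{P}$ are partitioned into
$n \times n$, $n \times n_c$, and $n_c \times n_c$ subblocks as

\[ \tilde{Q} = \begin{bmatrix} Q_1 & Q_{12} \\ Q_{12}^T & Q_2 \end{bmatrix}, \qquad
   \tilde{P} = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix}. \]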

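Expanding (3.1) and (3.2) yields

\[ 0 = A Q_1 + Q_1 A^T + B C_c Q_{12}^T + Q_{12} (B C_c)^T + V_1, \tag{3.6} \]
\[ 0 = A Q_{12} + Q_{12} A_c^T + B C_c Q_2 + Q_1 (B_c C)^T, \tag{3.7} \]
\[ 0 = A_c Q_2 + Q_2 A_c^T + B_c C Q_{12} + Q_{12}^T (B_c C)^T + B_c V_2 B_c^T, \tag{3.8} \]
\[ 0 = A^T P_1 + P_1 A + (B_c C)^T P_{12}^T + P_{12} B_c C + R_1, \tag{3.9} \]
\[ 0 = P_{12} A_c + A^T P_{12} + (B_c C)^T P_2 + P_1 B C_c, \tag{3.10} \]
\[ 0 = A_c^T P_2 + P_2 A_c + (B C_c)^T P_{12} + P_{12}^T B C_c + C_c^T R_2 C_c. \tag{3.11} \]

Writing (3.8) as (see [26], [27])

\[ 0 = (A_c + B_c C Q_{12} Q_2^{+}) Q_2 + Q_2 (A_c + B_c C Q_{12} Q_2^{+})^T + B_c V_2 B_c^T \]

where $Q_2^{+}$ is the Moore-Penrose or Drazin generalized inverse of $Q_2$, it
follows from [28, Lemmas 2.1 and 12.2] that $Q_2$ is positive definite.
Similarly, (3.11) implies that $P_2$ is positive definite. This justifies (3.4)
and (3.5).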

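Now define the $n \times n$ nonnegative-definite matrices (see [26], [27])

\[ Q \triangleq Q_1 - Q_{12} Q_2^{-1} Q_{12}^T, \qquad P \triangleq P_1 - P_{12} P_2^{-1} P_{12}^T, \]
\[ \hat{Q} \triangleq Q_{12} Q_2^{-1} Q_{12}^T, \qquad \hat{P} \triangleq P_{12} P_2^{-1} P_{12}^T \]

and note that (3.3) implies (2.8) and (2.9) with

\[ G \triangleq Q_2^{-1} Q_{12}^T, \qquad M \triangleq Q_2 P_2, \qquad \Gamma \triangleq -P_2^{-1} P_{12}^T. \]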
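The step from (3.3) to (2.8) and (2.9) can be spelled out as follows. This is a sketch of the omitted algebra, assuming only that $Q_2$ and $P_2$ are invertible (as established above), that $G = Q_2^{-1} Q_{12}^T$, $M = Q_2 P_2$, $\Gamma = -P_2^{-1} P_{12}^T$, and that $\hat{Q} = Q_{12} Q_2^{-1} Q_{12}^T$, $\hat{P} = P_{12} P_2^{-1} P_{12}^T$. It uses (3.3) in the forms $P_{12}^T Q_{12} = -P_2 Q_2$ and, by transposition, $Q_{12}^T P_{12} = -Q_2 P_2$.

```latex
\begin{align*}
\Gamma G^T &= -P_2^{-1} P_{12}^T Q_{12} Q_2^{-1}
            = P_2^{-1} (P_2 Q_2) Q_2^{-1} = I_{n_c}, \\
\hat{Q}\hat{P} &= Q_{12} Q_2^{-1} (Q_{12}^T P_{12}) P_2^{-1} P_{12}^T
            = Q_{12} Q_2^{-1} (-Q_2 P_2) P_2^{-1} P_{12}^T
            = -Q_{12} P_{12}^T, \\
G^T M \Gamma &= Q_{12} Q_2^{-1} (Q_2 P_2) (-P_2^{-1} P_{12}^T)
            = -Q_{12} P_{12}^T .
\end{align*}
```

The last two lines agree, which gives (2.8), and the first line is (2.9).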

Since $Q_2 P_2 = P_2^{-1/2} (P_2^{1/2} Q_2 P_2^{1/2}) P_2^{1/2}$, $M$ is positive semisimple.
Sylvester's inequality yields (2.17). Note also that

\[ \hat{Q} = \tau \hat{Q}, \qquad \hat{P} = \hat{P} \tau. \tag{3.12} \]

Next (2.11) and (2.12) follow from (3.4) and (3.5) by using the
identities

\[ Q_1 = Q + \hat{Q}, \qquad P_1 = P + \hat{P}, \tag{3.13} \]
\[ Q_{12} = \hat{Q} \Gamma^T, \qquad P_{12} = -\hat{P} G^T, \tag{3.14} \]
\[ Q_2 = \Gamma \hat{Q} \Gamma^T, \qquad P_2 = G \hat{P} G^T. \tag{3.15} \]

Now substitute (2.11), (2.12), and (3.13)-(3.15) into (3.6)-(3.11) and use
the relations

\[ B_c C = \Gamma Q \bar{\Sigma}, \qquad B C_c = -\Sigma P G^T, \]
\[ B_c V_2 B_c^T = \Gamma Q \bar{\Sigma} Q \Gamma^T, \qquad C_c^T R_2 C_c = G P \Sigma P G^T. \]

Then (2.10) follows from $(3.8) - \Gamma (3.7)$. Substituting (2.10) into (3.7),
(3.8), (3.10), and (3.11) shows that $((3.7) G)^T$ and $-(3.10) \Gamma$ are
precisely (2.15) and (2.16). Since $G^T (3.8) G = (2.15) \tau^T$ and $\Gamma^T (3.11) \Gamma =
\tau^T (2.16)$, (3.8) and (3.11) can be omitted. Finally, using (3.12) it follows
that $(2.13) = (3.6) + (2.15) \tau^T - (2.15) - (2.15)^T$ and similarly for
(2.14). □

IV.  DIRECTIONS  FOR  FURTHER  RESEARCH 

With regard to the existence of a stabilizing compensator, known
results (e.g., [28]-[34]) can be exploited to a great extent. A numerical
algorithm for solving the optimal projection equations has been developed
in [24] and [35]. The proposed computational scheme is philosophically
quite different from gradient search algorithms [2], [3], [6], [7], [9], [11],
[36], [37] in that it operates through direct solution of the optimal
projection equations by iterative refinement of the optimal projection.
Methods for eliminating local extrema are being investigated by applying
component cost analysis [17]. Generalizations of the optimal projection
equations can arise by considering the following extensions of the
fixed-order dynamic-compensation problem.

1)  Discrete-Time  System /Discrete-Time  Compensator:  Digital  im¬ 
plementation  can  be  modeled  by  a  discrete-time  compensator  with  control 
of  a  continuous-time  system  facilitated  by  sampling  and  reconstruction 
devices. 
2) Cross Weighting/Correlated Disturbance and Observation
Noise: This extension is straightforward and entirely analogous to the
LQG case (see, e.g., [3, p. 351]).

3)  Singular  Observation  Noise/Singular  Control  Weighting:  With 
due  attention  to  (2.7),  direct  output  feedback  can  be  used  in  the  singular 
case.  The  nature  of  the  problem  forebodes  all  of  the  difficulties  associated 
with  the  singular  LQG  problem.  Note  that  the  output  feedback  problem 
[22], [38], when viewed in this context, is highly singular.

4)  Infinite-Dimensional  Systems:  The  optimal  projection  equations 
have been extended in [39] and [40] to the case in which (2.1) is a
distributed  parameter  system,  for  example,  a  partial  or  functional 
differential  equation. 

5)  Decentralized  Fixed-Order  Controller:  The  optimal  projection 
equations  can  be  derived  for  the  case  in  which  the  dynamic  controller  has 
a  fixed  decentralized  structure. 

6) Parameter Uncertainties: The original derivation in [23] treated a
Stratonovich  state-dependent  noise  model  representing  parameter  uncer¬ 
tainties  in  the  plant.  Further  consideration  of  control-  and  measurement- 
dependent  noise  raises  the  possibility  of  directly  including  the  impact  of 
parameter  uncertainties  in  the  design  of  robust,  implementable  compensa¬ 
tion  for  large-order  systems. 
ABSTRACT


First-order  necessary  conditions  for  quadratically-optimal  reduced-order 
modelling  of  linear  time-invariant  systems  are  derived  in  the  form  of  a  pair  of 
modified  Lyapunov  equations  coupled  by  an  oblique  projection  which  determines  the 
optimal  reduced-order  model.  This  form  of  the  necessary  conditions  considerably 
simplifies  previous  results  of  Wilson  ([1])  and  clearly  demonstrates  the  quadratic 
extremality  and  nonoptimality  of  the  balancing  method  of  Moore  ([2]).  The 
possible  existence  of  multiple  solutions  of  the  optimal  projection  equations  is 
demonstrated  and  a  relaxation-type  algorithm  is  proposed  for  computing  these  local 
extrema.  A  cost  component  analysis  of  the  model-error  criterion  similar  to  the 
approach  of  Skelton  ([3])  is  utilized  at  each  iteration  to  direct  the  algorithm  to 
the  global  minimum. 
The problem of approximating a high-order linear dynamical system with a
relatively simpler system, i.e., the model-reduction problem, has received
considerable attention in recent years. Among the myriad papers devoted to this
problem are the notable contributions of Wilson ([1]), Moore ([2]) and Skelton
([3]) with which the present paper is concerned. In his 1970 paper, Wilson
proposed an optimality-based approach to model reduction which involves minimizing
the steady-state, quadratically-weighted output error when the original system and
reduced-order model are subjected to white-noise inputs. For the resulting
parameter optimization problem he obtained first-order necessary conditions which
have the form of an aggregation (as, e.g., [4]) and which involve the solution of
two unwieldy nonlinear matrix equations each of order $n + n_m$, where $n$ and $n_m$ are
the orders of the original and reduced-order models, respectively ([5, 6]).

Some  time  later  Moore  proposed  a  quite  different  approach  to  model 
reduction  based  upon  system-theoretic  arguments  as  opposed  to  optimality  criteria. 
Using  the  eigenvalues  of  the  product  of  the  controllability  and  observability 
gramians  (which  satisfy  nxn  Lyapunov  equations),  his  method  identifies  subsystems 
which  contribute  little  to  the  impulse  response  of  the  overall  system.  Such  "weak" 
subsystems  are  thus  eliminated  to  obtain  a  reduced-order  model.  This  technique, 
known  as  balancing,  has  been  vigorously  developed  in  the  recent  literature 
([7-11]).  Since  this  approach  is  completely  independent  of  any  optimality 
considerations,  however,  there  is  no  guarantee  that  such  reduced-order  models  are 
in  any  sense  optimal. 

A third approach to model reduction, proposed by Skelton ([3, 12]), also
utilizes  a  quadratic  optimality  criterion  as  in  [1].  However,  rather  than 
proceeding  from  necessary  conditions  as  does  Wilson,  Skelton  determines  for  a  given 
basis  the  contribution  (cost)  of  each  state  in  a  decomposition  of  the  error 
criterion  and  truncates  those  with  the  least  value.  Although  this  approach  is 
guided  by  optimality  considerations,  no  rigorous  guarantee  of  optimality  is 
possible  because  of  dependence  on  the  choice  of  state  space  basis. 
The present paper has five main objectives, the first of which is to
show  how  the  complex  optimality  conditions  of  Wilson  (see  (A.7)-(A.12)  of  [1])  can 
be  transformed  without  loss  of  generality  into  much  simpler  and  more  tractable 
forms.  The  transformation  is  facilitated  by  exploiting  the  presence  of  an  oblique 
(i.e.,  nonorthogonal)  projection  which  was  not  recognized  in  [1]  and  which  arises 
as  a  direct  consequence  of  optimality.  The  resulting  "optimal  projection 
equations"  constitute  a  coupled  system  of  two  nxn  modified  Lyapunov  equations  (see 
(2.13), (2.14) or (2.21), (2.22)) whose solutions are given by a pair of rank-$n_m$
controllability  and  observability  pseudogramians.  The  highly  structured  form  of 
these  equations  gives  crucial  insight  into  the  set  of  local  extrema  satisfying  the 
first-order  necessary  conditions. 

The  second  objective  of  the  paper  is  to  show  how  the  optimal  projection 
equations  provide  a  rigorous  extremality  context  for  Moore's  balancing  method  and 
to  clearly  demonstrate  its  nonoptimality.  Although  for  some  problems  the  "weak 
subsystem"  hypothesis  leads  to  a  nearly  optimal  reduced-order  model,  we  construct 
examples  for  which  the  reduced-order  model  obtained  from  the  balancing  method  is 
much  worse  with  respect  to  the  least  squares  criterion  than  the  quadratically- 
optimal  reduced-order  model.  In  general,  all  that  can  be  said  is  that  the 
presence  of  a  weak  subsystem  indicates  that  the  reduced-order  model  obtained  by 
truncation  in  the  balanced  basis  may  be  in  the  proximity  of  an  extremal  of  the 
optimal  model-reduction  problem;  however,  this  extremal  may  very  well  be  a  global 
maximum.  It  should  be  noted  that  in  a  recent  paper  ([13]),  Kabamba  has  used 
bounds  on  the  model  error  to  demonstrate  the  quadratic  nonoptimality  of  the 
balancing  method. 

The third objective of the paper is to demonstrate via an example the
mechanism responsible for the existence of multiple extrema of the optimal
model-reduction problem. By characterizing the optimal projection as a sum of
rank-1 eigenprojections of the product of the rank-deficient pseudogramians, it is
immediately clear that the first-order necessary conditions of the problem are
ambiguous in the sense that they fail to specify which $n_m$ eigenprojections
comprise the optimal projection corresponding to a solution (i.e., global minimum)
of the optimal model-reduction problem. Specifically, since the pseudogramians can
be rank deficient in $\binom{n}{n_m} = \frac{n!}{n_m!\,(n - n_m)!}$ ways, there may be precisely this many
extremal projections corresponding to an identical number of local extrema.


The  fourth  objective  of  the  paper  is  to  propose  a  numerical  algorithm 
for  solving  the  optimal  projection  equations  by  exploiting  their  structure  and 
taking  advantage  of  the  available  insights.  By  expressing  the  modified  Lyapunov 
equations in the form of "standard" Lyapunov equations, an iterative
relaxation-type  algorithm  is  developed.  The  crucial  aspect  of  the  proposed 
algorithm  involves  extracting  an  oblique  projection  at  each  step  from  the  product 
of the Lyapunov equations. Since $\binom{n}{n_m}$ rank-$n_m$ projections can be extracted from
the product of two $n \times n$ positive-definite matrices, it is quickly evident that the
criterion by which the $n_m$ eigenprojections are chosen determines which of the
numerous local extrema will be reached. If, for example, the projection is chosen
in accordance with the $n_m$ largest eigenvalues of the product of the solutions of
the  Lyapunov  equations,  then  it  should  not  be  surprising  in  view  of  the  previous 
discussion  that  a  global  maximum  may  very  well  be  reached.  In  this  case  the  first 
iteration  of  this  algorithm  involves  Lyapunov  equations  whose  solutions  are  the 
controllability  and  observability  gramians  and  the  eigenvalues  in  question  are 
precisely  the  squares  of  the  second-order  modes  ([2],  p.  24).  Thus  the  first 
iteration  coincides  with  the  (nonoptimal)  balancing  approach  of  [2]. 

Since  the  optimal  projection  equations  are  a  consequence  of 
differential  (local)  properties,  it  should  not  be  expected  that  they  alone  would 
possess  the  inherent  ability  to  identify  the  global  minimum.  Moreover,  because  of 
the  number  of  local  extrema,  second-order  necessary  conditions  appear  to  be 
useless.  Instead,  we  investigate  an  approach  which  chooses  the  eigenprojections 
according  to  a  cost-component  analysis  of  the  model-error  criterion.  This 
technique  can  lead  to  a  global  minimum  by  effectively  eliminating  the  local 
extrema  which  have  considerably  greater  cost  than  the  global  minimum.  This 
approach  is  philosophically  identical  to  the  component  cost  analysis  of  Skelton 
([3,12]).  Essentially,  then,  component  cost  analysis  is  utilized  at  each 
iteration  to  direct  the  algorithm  to  the  global  minimum.  Although  our  application 
of  this  technique  is  admittedly  heuristic,  it  should  be  noted  that  it  is 
essentially proposed as a device for efficiently "sorting out" the local extrema
which satisfy the otherwise mathematically rigorous necessary conditions. Hence
we propose component cost analysis as a crucial step in bridging the gap between
local extremality and global optimality.

It should be pointed out that neither the numerical algorithm proposed
in  this  paper  nor  the  iterative  algorithm  developed  in  [4]  and  [5]  has  been  proven 
to  be  convergent.  The  principal  contribution  of  the  present  paper,  however,  is 
not  a  particular  proposed  algorithm  but  rather  the  revelations  concerning  the 
structure  of  the  first-order  necessary  conditions.  The  proposed  numerical 
algorithm  should  be  considered  but  a  prelude  to  a  full  investigation  into 
numerical  algorithms  for  the  optimal  projection  equations.  It  should  also  be 
noted  that  the  presence  of  the  optimal  projection  was  not  exploited  in  developing 
the iterative algorithms in [4] and [5] (in fact, it was not even recognized in
[1])  and  hence  crucial  insight  into  local  extrema  was  lacking. 

The  fifth  and  last  objective  of  the  paper  is  to  point  out  the 
connection  between  the  optimal  projection  equations  for  model  reduction  obtained 
herein  and  the  first-order  necessary  conditions  obtained  recently  for  two  closely 
related  problems,  namely,  reduced-order  state  estimation  and  fixed-order  dynamic 
compensation. 

The  plan  of  the  paper  is  as  follows.  Section  2  begins  with  general 
notation  and  definitions  followed  by  the  model-reduction  problem  statement  and  the 
Main Theorem which presents the optimal projection equations for model reduction.
A series of remarks considers various aspects of the Main Theorem and sets the
stage for discussing connections with [1] and [2]. Section 3 contains a detailed
discussion  of  the  sense  in  which  the  optimal  projection  equations  simplify  the 
necessary  conditions  given  in  [1],  and  section  4  shows  how  the  approach  of  [2]  is 
approximately  extremal.  Section  5  presents  a  simple  example  which  clearly 
displays  the  possible  existence  of  multiple  extrema  satisfying  the  optimal 
projection  equations.  This  example  shows  that  the  balancing  method  of  [2]  may 
lead  to  a  nonoptimal  reduced-order  model  and  suggests  a  heuristic  procedure  for 
selecting  the  eigenprojections  comprising  the  projection  corresponding  to  the 
global  minimum,  i.e.,  the  optimal  projection.  In  section  6  a  numerical  algorithm 
for  solving  the  optimal  projection  equations  is  proposed  and  applied  to  an 
illustrative  example  considered  previously  in  [1]  and  [2]  as  well  as  to  some 
interesting examples considered recently by Kabamba in [13]. Suggestions for
further  research  are  given  in  section  7  and  the  proof  of  the  Main  Theorem  appears 
in the Appendix.

2. Problem Statement and Main Result

The following notation and definitions will be used throughout the paper:

$I_r$                                      $r \times r$ identity matrix
$Z^T$                                      transpose of vector or matrix $Z$
$Z^{-T}$                                   $(Z^T)^{-1}$ or $(Z^{-1})^T$
$\rho(Z)$                                  rank of matrix $Z$
$\mathrm{tr}\,Z$                           trace of square matrix $Z$
$\|Z\|$                                    $[\mathrm{tr}\,ZZ^T]^{1/2}$
$Z_{ij}$                                   $(i,j)$-element of matrix $Z$
$\mathrm{diag}(\alpha_1,\ldots,\alpha_r)$  $r \times r$ diagonal matrix with listed diagonal elements
$E_i$                                      matrix with unity in the $(i,i)$ position and zeros elsewhere
$\mathbb{E}$                               expected value
$\mathbb{R}$, $\mathbb{R}^{r \times s}$    real numbers, $r \times s$ real matrices
stable matrix                              matrix with eigenvalues in open left half plane
nonnegative-definite matrix                symmetric matrix with nonnegative eigenvalues
positive-definite matrix                   symmetric matrix with positive eigenvalues
nonnegative-semisimple matrix              matrix similar to a nonnegative-definite matrix
positive-semisimple matrix                 matrix similar to a positive-definite matrix
positive-diagonal matrix                   diagonal matrix with positive diagonal elements
$n$, $m$, $\ell$, $n_m$                    positive integers, $1 \le n_m \le n$
$x$, $u$, $y$, $x_m$, $y_m$                $n$, $m$, $\ell$, $n_m$, $\ell$-dimensional vectors
$A$, $B$, $C$                              $n \times n$, $n \times m$, $\ell \times n$ matrices
$A_m$, $B_m$, $C_m$                        $n_m \times n_m$, $n_m \times m$, $\ell \times n_m$ matrices
$R$, $V$                                   $\ell \times \ell$, $m \times m$ positive-definite matrices

We consider the following problem.

Optimal Model-Reduction Problem. Given the controllable and observable
system

$$\dot{x} = Ax + Bu, \tag{2.1}$$

$$y = Cx, \tag{2.2}$$

find a reduced-order model

$$\dot{x}_m = A_m x_m + B_m u, \tag{2.3}$$

$$y_m = C_m x_m \tag{2.4}$$

which minimizes the quadratic model-reduction criterion*

$$J(A_m,B_m,C_m) \triangleq \lim_{t\to\infty} \mathbb{E}\left[(y-y_m)^T R\,(y-y_m)\right],$$

where the input $u(t)$ is white noise with positive-definite intensity $V$. To
guarantee that $J$ is finite it is assumed that $A$ is stable and we restrict our
attention to the set of admissible reduced-order models

$$\mathcal{A} \triangleq \{(A_m,B_m,C_m):\ A_m \text{ is stable}\}.$$

Since the value of $J$ is independent of the internal realization of the transfer
function corresponding to (2.3) and (2.4), we further restrict our attention to
the set

$$\mathcal{A}_+ \triangleq \{(A_m,B_m,C_m) \in \mathcal{A}:\ (A_m,B_m) \text{ is controllable and } (A_m,C_m) \text{ is observable}\}.$$


*$J$ will occasionally be referred to as the "model-reduction error" or, simply, as
the "cost."

The following lemma is needed for the statement of the main result.

Lemma 2.1. Suppose $\hat{Q}, \hat{P} \in \mathbb{R}^{n \times n}$ are nonnegative definite. Then $\hat{Q}\hat{P}$ is
nonnegative semisimple. Furthermore, if $\rho(\hat{Q}\hat{P}) = n_m$ then there exist
$G, \Gamma \in \mathbb{R}^{n_m \times n}$ and positive-semisimple $M \in \mathbb{R}^{n_m \times n_m}$ such that

$$\hat{Q}\hat{P} = G^T M \Gamma, \tag{2.5}$$

$$\Gamma G^T = I_{n_m}. \tag{2.6}$$


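Proof. By Theorem 6.2.5, p. 123 of [14] there exists $n \times n$ invertible
$\Phi$ such that the nonnegative-definite matrices $D_Q \triangleq \Phi\hat{Q}\Phi^T$ and $D_P \triangleq \Phi^{-T}\hat{P}\Phi^{-1}$ are
both diagonal. Hence $D_Q D_P$ is nonnegative definite and $\hat{Q}\hat{P} = \Phi^{-1} D_Q D_P \Phi$ is nonnegative
semisimple. Next introduce $n \times n$ orthogonal $U$ to effect a rearrangement of basis if
necessary so that

$$\hat{Q}\hat{P} = \Psi \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} \Psi^{-1},$$

where $\Psi \triangleq \Phi^{-1}U^T$ and $n_m \times n_m$ $\Lambda$ is positive diagonal. Hence, for all $n_m \times n_m$ invertible $S$,

$$\hat{Q}\hat{P} = \Psi \begin{bmatrix} S \\ 0 \end{bmatrix} (S^{-1}\Lambda S) \begin{bmatrix} S^{-1} & 0 \end{bmatrix} \Psi^{-1}$$

and thus (2.5) and (2.6) hold with $G = [S^T\ \ 0]\Psi^T$, $M = S^{-1}\Lambda S$ and $\Gamma = [S^{-1}\ \ 0]\Psi^{-1}$. □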


For convenience in stating the Main Theorem we shall refer to $G, \Gamma \in \mathbb{R}^{n_m \times n}$
and positive-semisimple $M \in \mathbb{R}^{n_m \times n_m}$ satisfying (2.5) and (2.6) as a
$(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$. Also, define the positive-definite controllability
and observability gramians

$$W_c \triangleq \int_0^\infty e^{At} B V B^T e^{A^T t}\,dt, \qquad W_o \triangleq \int_0^\infty e^{A^T t} C^T R C e^{At}\,dt,$$

which satisfy the dual Lyapunov equations

$$0 = A W_c + W_c A^T + B V B^T, \tag{2.7}$$

$$0 = A^T W_o + W_o A + C^T R C. \tag{2.8}$$
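The dual Lyapunov equations (2.7) and (2.8) can be solved numerically in a few lines. The following is a minimal sketch, assuming a small hypothetical three-state plant (the matrices `A`, `B`, `C` are illustrative values, not taken from the paper) and a helper `lyap` introduced here that vectorizes the equation with Kronecker products; a production code would use a dedicated Lyapunov solver instead.

```python
import numpy as np

def lyap(A, Q):
    """Solve A X + X A^T + Q = 0 for symmetric Q by vectorization (small n only)."""
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A X) = (A kron I) vec(X), vec(X A^T) = (I kron A) vec(X)
    K = np.kron(A, I) + np.kron(I, A)
    X = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)
    return (X + X.T) / 2          # remove round-off asymmetry

# Hypothetical stable, controllable and observable plant (illustration only)
A = np.array([[-1.0, 0.5, 0.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, -3.0]])
B = np.array([[1.0], [0.5], [2.0]])
C = np.array([[1.0, 0.0, 1.0]])
V = np.eye(1)                     # input noise intensity
R = np.eye(1)                     # output weighting

Wc = lyap(A, B @ V @ B.T)         # controllability gramian, equation (2.7)
Wo = lyap(A.T, C.T @ R @ C)       # observability gramian, equation (2.8)
```

Both gramians come out symmetric and positive definite here because the illustrative plant is stable, controllable and observable.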


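Main Theorem. Suppose $(A_m,B_m,C_m) \in \mathcal{A}_+$ solves the optimal
model-reduction problem. Then there exist nonnegative-definite matrices
$\hat{Q}, \hat{P} \in \mathbb{R}^{n \times n}$ such that, for some $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$, $A_m$, $B_m$ and $C_m$
are given by

$$A_m = \Gamma A G^T, \tag{2.9}$$

$$B_m = \Gamma B, \tag{2.10}$$

$$C_m = C G^T, \tag{2.11}$$

and such that, with $\tau \triangleq G^T\Gamma$, the following conditions are satisfied:

$$\rho(\hat{Q}) = \rho(\hat{P}) = \rho(\hat{Q}\hat{P}) = n_m, \tag{2.12}$$

$$0 = \tau\,[A\hat{Q} + \hat{Q}A^T + BVB^T], \tag{2.13}$$

$$0 = [A^T\hat{P} + \hat{P}A + C^TRC]\,\tau. \tag{2.14}$$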

Several comments are in order. First note that the Main Theorem
consists of necessary conditions in the form of two modified Lyapunov equations
(2.13) and (2.14) plus rank conditions (2.12) which must possess nonnegative-definite
solutions $\hat{Q}$, $\hat{P}$ when an optimal reduced-order model exists. We shall call
$\hat{Q}$ and $\hat{P}$ the controllability and observability pseudogramians, respectively, since
they are analogous to $W_c$ and $W_o$ and yet have rank deficiency. The modified
Lyapunov equations are coupled by the $n \times n$ matrix $\tau$ which is a projection
(idempotent matrix) since, by (2.6),

$$\tau^2 = G^T(\Gamma G^T)\Gamma = G^T\Gamma = \tau.$$

Note that in general $\tau$ is an oblique projection and not necessarily an orthogonal
projection since it may not be symmetric. We shall refer to a projection $\tau$
corresponding to a solution (i.e., global minimum) of the optimal model-reduction
problem as an "optimal projection." It should be stressed that the form of the
optimal reduced-order model (2.9)-(2.11) is a direct consequence of optimality and
not the result of an a priori assumption on the structure of the reduced-order
model.


Since the optimal projection equations are first-order necessary
conditions for optimality, they may possess multiple solutions corresponding to
various local extrema such as local maxima, local minima, saddle points, etc. The
following definition will prove useful.

Definition 2.1. Nonnegative-definite $\hat{Q}, \hat{P} \in \mathbb{R}^{n \times n}$ are extremal if
(2.12)-(2.14) are satisfied. $(A_m,B_m,C_m) \in \mathcal{A}_+$ is extremal if there exist
extremal $\hat{Q}$, $\hat{P}$ such that $(A_m,B_m,C_m)$ is given by (2.9)-(2.11) for some
$(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$. The corresponding projection $\tau$ is an extremal
projection.


Proposition 2.1. Suppose $(A_m,B_m,C_m)$ is extremal. Then the
model-reduction error is given by

$$J(A_m,B_m,C_m) = 2\,\mathrm{tr}[(\hat{Q}\hat{P} - W_cW_o)A]. \tag{2.15}$$

Proof. The proof is given at the end of Appendix A. □

Remark 2.1. Noting the identities

$$-2\,\mathrm{tr}[W_cW_oA] = \mathrm{tr}[C^TRCW_c] = \mathrm{tr}[BVB^TW_o], \tag{2.16}$$

which follow from (2.7) and (2.8), (2.15) can be written for extremal
$(A_m,B_m,C_m)$ as

$$J(A_m,B_m,C_m) = 2\,\mathrm{tr}[\hat{Q}\hat{P}A] + \mathrm{tr}[C^TRCW_c] = 2\,\mathrm{tr}[\hat{Q}\hat{P}A] + \mathrm{tr}[BVB^TW_o]. \tag{2.17}$$


For convenience in the following discussion, let $\hat{Q}$, $\hat{P}$, $G$, $M$, $\Gamma$ and $\tau$
correspond to some extremal $(A_m,B_m,C_m)$. Now observe that if $x_m$ is replaced
by $Sx_m$ then an "equivalent" reduced-order model is obtained with $(A_m,B_m,C_m)$
replaced by $(SA_mS^{-1}, SB_m, C_mS^{-1})$. Since $J(A_m,B_m,C_m) = J(SA_mS^{-1}, SB_m, C_mS^{-1})$
one would expect the Main Theorem to apply also to $(SA_mS^{-1}, SB_m, C_mS^{-1})$. Indeed,
the following result shows that this transformation corresponds to the alternative
factorization $\hat{Q}\hat{P} = (S^{-T}G)^T (SMS^{-1}) (S\Gamma)$ and, moreover, that all $(G,M,\Gamma)$-factorizations of
$\hat{Q}\hat{P}$ are related by a nonsingular transformation.


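Proposition 2.2. If $S \in \mathbb{R}^{n_m \times n_m}$ is invertible then $\bar{G} = S^{-T}G$, $\bar{\Gamma} = S\Gamma$ and
$\bar{M} = SMS^{-1}$ satisfy

$$\hat{Q}\hat{P} = \bar{G}^T \bar{M}\, \bar{\Gamma}, \tag{2.5'}$$

$$\bar{\Gamma}\bar{G}^T = I_{n_m}. \tag{2.6'}$$

Conversely, if $\bar{G}, \bar{\Gamma} \in \mathbb{R}^{n_m \times n}$ and invertible $\bar{M} \in \mathbb{R}^{n_m \times n_m}$ satisfy (2.5)' and (2.6)'
then there exists invertible $S \in \mathbb{R}^{n_m \times n_m}$ such that $\bar{G} = S^{-T}G$, $\bar{\Gamma} = S\Gamma$ and $\bar{M} = SMS^{-1}$.

Proof. The first part is immediate. The second part follows by taking
$S = \bar{M}^{-1}\bar{\Gamma}G^TM$, noting $S^{-1} = M\Gamma\bar{G}^T\bar{M}^{-1}$ and using the identities $\bar{\Gamma}G^TM\Gamma\bar{G}^T = \bar{M}$
and $M\Gamma\bar{G}^T = \Gamma\bar{G}^T\bar{M}$. □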


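The next result shows that there exists a similarity transformation
which simultaneously diagonalizes $\hat{Q}\hat{P}$ and $\tau$.

Proposition 2.3. There exists invertible $\Phi \in \mathbb{R}^{n \times n}$ such that

$$\hat{Q} = \Phi^{-1} \begin{bmatrix} \Lambda_Q & 0 \\ 0 & 0 \end{bmatrix} \Phi^{-T}, \qquad \hat{P} = \Phi^{T} \begin{bmatrix} \Lambda_P & 0 \\ 0 & 0 \end{bmatrix} \Phi, \tag{2.18}$$

$$\hat{Q}\hat{P} = \Phi^{-1} \begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix} \Phi, \qquad \tau = \Phi^{-1} \begin{bmatrix} I_{n_m} & 0 \\ 0 & 0 \end{bmatrix} \Phi, \tag{2.19a,b}$$

where $\Lambda_Q, \Lambda_P \in \mathbb{R}^{n_m \times n_m}$ are positive diagonal, $\Lambda \triangleq \Lambda_Q\Lambda_P$ and the diagonal
elements of $\Lambda$ are the eigenvalues of $M$. Consequently,

$$\hat{Q} = \tau\hat{Q}, \qquad \hat{P} = \hat{P}\tau. \tag{2.20}$$

Proof. By Theorem 6.2.5, p. 123 of [14], and by (2.12) there exists
$n \times n$ invertible $\Phi$ such that (2.18) holds and thus (2.19a) also holds. Define

$$\bar{G} = [I_{n_m}\ \ 0]\,\Phi^{-T}, \qquad \bar{M} = \Lambda \qquad \text{and} \qquad \bar{\Gamma} = [I_{n_m}\ \ 0]\,\Phi$$

so that (2.5)' and (2.6)' are satisfied. By the second part of Proposition 2.2
there exists invertible $S \in \mathbb{R}^{n_m \times n_m}$ such that $G = S^T\bar{G}$, $M = S^{-1}\bar{M}S$ and $\Gamma = S^{-1}\bar{\Gamma}$.
Now (2.19b) follows from

$$\tau = G^T\Gamma = \bar{G}^T\bar{\Gamma} = \Phi^{-1} \begin{bmatrix} I_{n_m} & 0 \\ 0 & 0 \end{bmatrix} \Phi. \qquad \square$$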


It is useful to present an alternative form of the optimal
model-reduction equations (2.13) and (2.14). For convenience define the notation

$$\tau_\perp \triangleq I_n - \tau.$$

Proposition 2.4. Equations (2.13) and (2.14) are equivalent,
respectively, to

$$0 = A\hat{Q} + \hat{Q}A^T + BVB^T - \tau_\perp BVB^T \tau_\perp^T, \tag{2.21}$$

$$0 = A^T\hat{P} + \hat{P}A + C^TRC - \tau_\perp^T C^TRC\, \tau_\perp. \tag{2.22}$$

Proof. By (2.20), (2.21) = (2.13) + (2.13)$^T$ − (2.13)$\tau^T$ and
(2.13) = $\tau$(2.21). Similarly, (2.14) and (2.22) are equivalent. □

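Remark 2.2. Noting the identities

$$2\,\mathrm{tr}[\hat{Q}\hat{P}A] = -\mathrm{tr}[C^TRC\hat{Q}] = -\mathrm{tr}[BVB^T\hat{P}], \tag{2.23}$$

which follow from (2.20)-(2.22), (2.17) can be written for extremal $(A_m,B_m,C_m)$ as

$$J(A_m,B_m,C_m) = \mathrm{tr}[C^TRC(W_c - \hat{Q})] = \mathrm{tr}[BVB^T(W_o - \hat{P})]. \tag{2.24}$$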


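To facilitate the discussion in the following sections, we consider the
change of basis $\hat{x} = \Phi x$, where $\Phi$ is given by Proposition 2.3. Writing (2.1) and
(2.2) as

$$\dot{\hat{x}} = \hat{A}\hat{x} + \hat{B}u, \tag{2.25}$$

$$y = \hat{C}\hat{x}, \tag{2.26}$$

where

$$\hat{A} \triangleq \Phi A \Phi^{-1}, \qquad \hat{B} \triangleq \Phi B, \qquad \hat{C} \triangleq C\Phi^{-1},$$

(2.9)-(2.11) become

$$A_m = \hat{\Gamma}\hat{A}\hat{G}^T, \tag{2.27}$$

$$B_m = \hat{\Gamma}\hat{B}, \tag{2.28}$$

$$C_m = \hat{C}\hat{G}^T, \tag{2.29}$$

where

$$\hat{\Gamma} \triangleq \Gamma\Phi^{-1}, \qquad \hat{G} \triangleq G\Phi^{T}$$

satisfy

$$\hat{G}^T\hat{\Gamma} = \begin{bmatrix} I_{n_m} & 0 \\ 0 & 0 \end{bmatrix}. \tag{2.30}$$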


Note that (2.30) implies

$$\hat{\Gamma} = [S\ \ 0], \qquad \hat{G} = [S^{-T}\ \ 0], \tag{2.31}$$

for some $n_m \times n_m$ invertible $S$. Partitioning

$$\hat{x} = \begin{bmatrix} \hat{x}_m \\ \hat{x}_2 \end{bmatrix}, \qquad \hat{A} = \begin{bmatrix} \hat{A}_m & \hat{A}_{m2} \\ \hat{A}_{2m} & \hat{A}_{22} \end{bmatrix}, \qquad \hat{B} = \begin{bmatrix} \hat{B}_m \\ \hat{B}_2 \end{bmatrix}, \qquad \hat{C} = [\hat{C}_m\ \ \hat{C}_2],$$

where $\hat{x}_m \in \mathbb{R}^{n_m}$ and $\hat{A}_m$, $\hat{B}_m$ and $\hat{C}_m$ are $n_m \times n_m$, $n_m \times m$ and $\ell \times n_m$, respectively,
(2.27)-(2.29) and (2.31) yield

$$A_m = S\hat{A}_mS^{-1}, \qquad B_m = S\hat{B}_m, \qquad C_m = \hat{C}_mS^{-1}.$$

This shows that the optimal reduced-order model (modulo a state transformation)
can be obtained by truncating the last $n - n_m$ states of the original system when it is
expressed in the basis with respect to which $\hat{Q}$ and $\hat{P}$ have the diagonal forms
$\begin{bmatrix} \Lambda_Q & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} \Lambda_P & 0 \\ 0 & 0 \end{bmatrix}$. Since the optimal projection $\tau$ has the simple form $\begin{bmatrix} I_{n_m} & 0 \\ 0 & 0 \end{bmatrix}$
in this basis, we shall refer to (2.25) and (2.26) as an optimal projection
realization of (2.1) and (2.2). Note that when (2.21) and (2.22) are expanded in
an optimal projection basis (i.e., a basis corresponding to an optimal projection
realization) they assume the form


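$$0 = \hat{A}_m\Lambda_Q + \Lambda_Q\hat{A}_m^T + \hat{B}_mV\hat{B}_m^T, \tag{2.32}$$

$$0 = \hat{A}_{2m}\Lambda_Q + \hat{B}_2V\hat{B}_m^T, \tag{2.33}$$

$$0 = \hat{A}_m^T\Lambda_P + \Lambda_P\hat{A}_m + \hat{C}_m^TR\hat{C}_m, \tag{2.34}$$

$$0 = \Lambda_P\hat{A}_{m2} + \hat{C}_m^TR\hat{C}_2. \tag{2.35}$$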

If $\Phi$ in Proposition 2.3 is replaced by

$$\begin{bmatrix} (\Lambda_Q^{-1}\Lambda_P)^{1/4} & 0 \\ 0 & I_{n-n_m} \end{bmatrix} \Phi,$$

which corresponds to a change of basis for the reduced-order model obtained by
truncation, then $\Lambda_Q$ and $\Lambda_P$ are both replaced by $(\Lambda_Q\Lambda_P)^{1/2}$ and hence this
can be called a balanced optimal projection basis, utilizing the terminology of
[2]. Thus, in a balanced optimal projection realization, $\Lambda_Q$ and $\Lambda_P$ appearing
in (2.32)-(2.35) are equal.

The next result provides an interesting closed-form characterization of
an extremal projection in terms of the Drazin generalized inverse of $\hat{Q}\hat{P}$. Since
$(\hat{Q}\hat{P})^2 = G^TM^2\Gamma$, and hence $\rho((\hat{Q}\hat{P})^2) = \rho(\hat{Q}\hat{P})$, the "index" of $\hat{Q}\hat{P}$ (see [15], p. 121)
is 1. In this case the Drazin inverse is traditionally called the group inverse
and is denoted by $(\hat{Q}\hat{P})^{\#}$ ([15], p. 124). Since, as is easily verified, $(\hat{Q}\hat{P})^{\#} = G^TM^{-1}\Gamma$,
(2.6) leads to the following result.

Proposition 2.5. An extremal projection $\tau$ is given by

$$\tau = \hat{Q}\hat{P}(\hat{Q}\hat{P})^{\#}. \tag{2.36}$$


An alternative representation for an extremal projection will prove
useful for developing a numerical algorithm for solving (2.21) and (2.22). If
$Q, P \in \mathbb{R}^{r \times r}$ are nonnegative definite then by Lemma 2.1 $QP$ is nonnegative
semisimple and thus there exists invertible $\Psi \in \mathbb{R}^{r \times r}$ such that

$$QP = \Psi^{-1}\Omega\Psi,$$

where $\Omega = \mathrm{diag}(\omega_1,\ldots,\omega_r)$ and $\omega_i \ge 0$ are the eigenvalues of $QP$. Now define
the $i$th eigenprojection ([16], p. 41)

$$\Pi_i[QP] \triangleq \Psi^{-1}E_i\Psi,$$

which is a rank-1 oblique projection. Note that $QP$ has the decomposition

$$QP = \sum_{i=1}^{r} \omega_i\,\Pi_i[QP].$$

Proposition 2.6. An extremal projection $\tau$ is given by

$$\tau = \sum_{i=1}^{n_m} \Pi_i[\hat{Q}\hat{P}], \tag{2.37}$$

where the $i$th eigenprojection $\Pi_i[\hat{Q}\hat{P}]$ corresponds to the $i$th nonzero eigenvalue
$\lambda_i$ of $\hat{Q}\hat{P}$.
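The eigenprojection sum of Proposition 2.6 can be checked numerically on any rank-deficient nonnegative-definite pair. The sketch below is only an illustration: the factors `X` and `Y` are hypothetical values chosen so that the pair has rank 2, the pair is not claimed to be extremal, and the helper `top_eigvecs` is introduced here. Each rank-1 eigenprojection is formed from a right eigenvector $v_i$ and a left eigenvector $u_i$ as $v_iu_i^T/(u_i^Tv_i)$, and the sum is compared with the closed-form oblique projection onto the range of `X` along the null space of `Y.T`.

```python
import numpy as np

# Hypothetical rank-2 factors (values chosen only for illustration)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
Y = np.array([[1.0, 1.0], [0.0, 1.0], [2.0, 0.0], [1.0, 0.0]])
Q_hat = X @ X.T                      # nonnegative definite, rank 2
P_hat = Y @ Y.T                      # nonnegative definite, rank 2
n_m = 2

def top_eigvecs(M, k):
    """Eigenpairs of M belonging to its k largest (real) eigenvalues."""
    w, v = np.linalg.eig(M)
    idx = np.argsort(w.real)[::-1][:k]
    return w.real[idx], v.real[:, idx]

QP = Q_hat @ P_hat
lam, right = top_eigvecs(QP, n_m)    # QP v_i = lam_i v_i
_, left = top_eigvecs(QP.T, n_m)     # u_i^T QP = lam_i u_i^T

# Rank-1 eigenprojection for each nonzero eigenvalue
projections = [np.outer(right[:, i], left[:, i]) / (left[:, i] @ right[:, i])
               for i in range(n_m)]
tau = sum(projections)               # sum of eigenprojections, as in (2.37)

# Closed form for comparison: projection onto range(X) along null(Y^T)
tau_closed = X @ np.linalg.inv(Y.T @ X) @ Y.T
```

The resulting `tau` is idempotent but not symmetric, and it satisfies the relations in (2.20) for this pair, which shows concretely why the projection is oblique rather than orthogonal.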


3. Relationship to Wilson's Form of the Necessary Conditions

The optimal model-reduction problem considered in the previous section
is identical to the problem considered by Wilson in [1] with the minor exception
that he sets $R = I_\ell$. In [1] $G^T$ and $\Gamma$ are denoted by $\theta_2$ and $\theta_1$, (2.6) appears as
(15) and (2.9)-(2.11) are given by (14a,b). Note that in [1], $\theta_1$ and $\theta_2$ depend
upon the solutions of a pair of $(n+n_m) \times (n+n_m)$ Lyapunov equations (see (7), (9)
of [1] or (A.2), (A.3) of the present paper) whose coefficients and nonhomogeneous
terms depend in turn on $A_m$, $B_m$ and $C_m$ (see (A.7)-(A.12)). The advantage of
the $n \times n$ optimal projection equations (2.21) and (2.22) over the form of the
necessary conditions given in [1] is that the former are independent of $A_m$, $B_m$
and $C_m$. Moreover, the optimal projection $\tau$, which was not recognized in [1],
can be seen to play a fundamental role by coupling the modified Lyapunov equations
(2.21) and (2.22) and determining (since $\tau = G^T\Gamma$) $A_m$, $B_m$ and $C_m$ in (2.9)-(2.11).

m  m  m 

4.  Relationship  to  Moore's  Balancing  Method 


In contrast to Wilson's method for model reduction which is based on
optimality principles, the approach due to Moore ([2]) relies on system-theoretic
ideas. The main thrust of this approach "is to eliminate any weak subsystem which
contributes little to the impulse response matrix" ([2], p. 26). The concept of a
"weak subsystem" is defined by means of a dominance relation ([2], p. 28)
involving similarity invariants called second-order modes. Moore evaluates
reduced-order models obtained in this way by computing the relative error in the
impulse response given for MIMO systems by ([2], p. 29)

$$\epsilon(A_m,B_m,C_m) \triangleq \left[ \int_0^\infty \|H_e(t)\|^2\,dt \Big/ \int_0^\infty \|H(t)\|^2\,dt \right]^{1/2},$$

where $H_e(t) \triangleq H(t) - H_m(t)$, $H(t) \triangleq R^{1/2}Ce^{At}BV^{1/2}$ and $H_m(t) \triangleq R^{1/2}C_me^{A_mt}B_mV^{1/2}$.
To discuss this approach in the context of the optimal model-reduction problem we
assume that $V = I_m$ and $R = I_\ell$.

Proposition 4.1. Suppose $(A_m,B_m,C_m) \in \mathcal{A}_+$. Then

$$\epsilon(A_m,B_m,C_m) = \left[-\tfrac{1}{2}J(A_m,B_m,C_m)/\mathrm{tr}(W_cW_oA)\right]^{1/2}
= \left[J(A_m,B_m,C_m)/\mathrm{tr}(C^TRCW_c)\right]^{1/2}
= \left[J(A_m,B_m,C_m)/\mathrm{tr}(BVB^TW_o)\right]^{1/2}. \tag{4.1}$$

Proof. The result follows from (A.1), (A.8) and (A.9) which hold
without regard to either optimality or extremality. □


Note that Proposition 4.1 shows that the relative error in the impulse
response is minimized precisely when $J(A_m,B_m,C_m)$ is minimized. Actually,
this result is to be expected since, as shown in [1], $J$ can be obtained
alternatively by taking $u(t)$ to be an impulse at $t = 0$.


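To draw interesting comparisons with the results of [2], choose $n \times n$
invertible $\Psi$ such that $\Psi W_c\Psi^T$ and $\Psi^{-T}W_o\Psi^{-1}$ are both diagonal and hence

$$W_cW_o = \Psi^{-1}\Sigma^2\Psi, \tag{4.2}$$

where $\Sigma = \mathrm{diag}(\sigma_1,\ldots,\sigma_n)$ and the second-order modes $\sigma_i$ (i.e., the positive
square roots of the eigenvalues of $W_cW_o$) satisfy $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$.
This transformation corresponds to replacing (2.1), (2.2) by

$$\dot{\bar{x}} = \bar{A}\bar{x} + \bar{B}u, \tag{4.3}$$

$$y = \bar{C}\bar{x}, \tag{4.4}$$

where

$$\bar{x} \triangleq \Psi x, \qquad \bar{A} \triangleq \Psi A\Psi^{-1}, \qquad \bar{B} \triangleq \Psi B, \qquad \bar{C} \triangleq C\Psi^{-1}. \tag{4.5}$$

The transformed system (4.3), (4.4), called a principal axis realization ([17]),
can further be chosen so that

$$\Psi W_c\Psi^T = \Psi^{-T}W_o\Psi^{-1} = \Sigma, \tag{4.6}$$

i.e., the balanced realization. Using (4.5), (2.7) and (2.8) become

$$0 = \bar{A}\Sigma + \Sigma\bar{A}^T + \bar{B}V\bar{B}^T, \tag{4.7}$$

$$0 = \bar{A}^T\Sigma + \Sigma\bar{A} + \bar{C}^TR\bar{C}. \tag{4.8}$$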
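A balancing transformation satisfying (4.6) can be computed from a Cholesky factor of $W_c$ and a symmetric eigendecomposition. The following sketch assumes a hypothetical three-state plant with $V = I$ and $R = I$ (illustrative values, not from the paper); the helpers `lyap` and `balancing_transformation` are names introduced here. It is one standard construction of a balanced realization, not necessarily the procedure used in [2].

```python
import numpy as np

def lyap(A, Q):
    """Solve A X + X A^T + Q = 0 for symmetric Q by vectorization (small n only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    X = np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)
    return (X + X.T) / 2

def balancing_transformation(Wc, Wo):
    """Return Psi, sigma with Psi Wc Psi^T = Psi^{-T} Wo Psi^{-1} = diag(sigma)."""
    L = np.linalg.cholesky(Wc)                 # Wc = L L^T
    lam, U = np.linalg.eigh(L.T @ Wo @ L)      # eigenvalues in ascending order
    lam, U = lam[::-1], U[:, ::-1]             # reorder so sigma is descending
    sigma = np.sqrt(lam)                       # second-order modes
    T = L @ U / np.sqrt(sigma)                 # T = Psi^{-1}
    return np.linalg.inv(T), sigma

# Hypothetical stable, controllable and observable plant (illustration only)
A = np.array([[-1.0, 0.5, 0.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, -3.0]])
B = np.array([[1.0], [0.5], [2.0]])
C = np.array([[1.0, 0.0, 1.0]])

Wc = lyap(A, B @ B.T)
Wo = lyap(A.T, C.T @ C)
Psi, sigma = balancing_transformation(Wc, Wo)
Psi_inv = np.linalg.inv(Psi)
A_bar, B_bar, C_bar = Psi @ A @ Psi_inv, Psi @ B, C @ Psi_inv   # as in (4.5)

# Truncation to the leading n_m balanced states
n_m = 2
A_m, B_m, C_m = A_bar[:n_m, :n_m], B_bar[:n_m, :], C_bar[:, :n_m]
```

In the balanced coordinates both gramians equal `np.diag(sigma)`, so (4.7) and (4.8) can be verified directly with `A_bar`, `B_bar` and `C_bar`.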


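The model-reduction procedure suggested in [2] involves partitioning

$$\bar{x} = \begin{bmatrix} \bar{x}_m \\ \bar{x}_2 \end{bmatrix}, \qquad \bar{A} = \begin{bmatrix} \bar{A}_m & \bar{A}_{m2} \\ \bar{A}_{2m} & \bar{A}_{22} \end{bmatrix}, \qquad \bar{B} = \begin{bmatrix} \bar{B}_m \\ \bar{B}_2 \end{bmatrix}, \qquad \bar{C} = [\bar{C}_m\ \ \bar{C}_2],$$

where $\bar{x}_m \in \mathbb{R}^{n_m}$ and $\bar{A}_m$, $\bar{B}_m$ and $\bar{C}_m$ have corresponding dimension, and extracting
the reduced-order model $(\bar{A}_m,\bar{B}_m,\bar{C}_m)$. Hence the reduced-order model
$(\bar{A}_m,\bar{B}_m,\bar{C}_m)$ is extracted from (4.3), (4.4) in essentially the same way the
optimal reduced-order model is extracted from (2.25), (2.26).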

To see how the optimal-projection realization compares to a principal-axis
realization, first note that (2.13) and (2.14) are satisfied by $\hat{Q} = W_c$ and $\hat{P} = W_o$
when the rank conditions (2.12) are ignored. Indeed, since $W_c$ and $W_o$ are
positive definite, the rank conditions (2.12) do not hold. If, however, the
system (2.1), (2.2) is expressed in the balanced coordinate system (4.3), (4.4) (so
that $W_c = W_o = \Sigma$), then the assumption $\sigma_{n_m} \gg \sigma_{n_m+1}$ implies that $\rho(W_c)$, $\rho(W_o)$ and
$\rho(W_cW_o)$ are "approximately" equal to $n_m$ and thus, in this sense, condition (2.12)
is satisfied. This observation leads to the suggestion that when $\sigma_{n_m} \gg \sigma_{n_m+1}$, $W_c$
and $W_o$ are approximations to solutions $\hat{Q}$ and $\hat{P}$ of the optimal projection equations
and the reduced-order model $(\bar{A}_m,\bar{B}_m,\bar{C}_m)$ of Moore is an approximation to some
extremal $(A_m,B_m,C_m)$. There is no guarantee, of course, that any particular
extremum corresponds to the global minimum, or even to a local minimum.


5. Existence of Multiple Extrema and Cost-Component Ranking

In this section we show by means of a simple example that the optimal
projection equations may possess nonunique solutions corresponding to multiple
extrema, e.g., local minima or maxima. We also show how decomposing the cost can
identify the global minimum from amongst the numerous extrema. To begin, let
$m = \ell = n$, $R = V = I_n$,

$$A = \mathrm{diag}(-\alpha_1,\ldots,-\alpha_n),$$

where $\alpha_i > 0$, $i = 1,\ldots,n$, and suppose $B$ and $C$ are such that

$$BB^T = \mathrm{diag}(\beta_1,\ldots,\beta_n), \qquad C^TC = \mathrm{diag}(\gamma_1,\ldots,\gamma_n),$$

where $\beta_i > 0$, $\gamma_i > 0$, $i = 1,\ldots,n$. Hypothesizing diagonal solutions $\hat{Q}$ and $\hat{P}$ of
(2.21) and (2.22) leads to

$$\hat{Q} = \mathrm{diag}\left(\frac{\delta_1\beta_1}{2\alpha_1},\ldots,\frac{\delta_n\beta_n}{2\alpha_n}\right), \qquad \hat{P} = \mathrm{diag}\left(\frac{\delta_1\gamma_1}{2\alpha_1},\ldots,\frac{\delta_n\gamma_n}{2\alpha_n}\right),$$

where each $\delta_i$, $i = 1,\ldots,n$, is either zero or one and exactly $n_m$ of the $\delta_i$'s are equal
to one. Hence $\tau = \mathrm{diag}(\delta_1,\ldots,\delta_n)$. Note that there are $\binom{n}{n_m}$ such solutions of the
optimal projection equations corresponding to $\binom{n}{n_m}$ local extrema.
Since $\hat{Q} = \tau W_c$, $\hat{P} = \tau W_o$ and $A$, $W_c$ and $W_o$ commute, (2.15) becomes

$$J(A_m,B_m,C_m) = -\tfrac{1}{2}\,\mathrm{tr}[\tau_\perp A^{-1}BB^TC^TC].$$

Hence

$$J(A_m,B_m,C_m) = \sum_{i=1}^{n} \xi_i(1-\delta_i), \tag{5.1}$$

where

$$\xi_i \triangleq \beta_i\gamma_i/2\alpha_i.$$

To minimize $J$ it is clear that $\delta_i$ should be chosen to be unity for
the largest $n_m$ elements of the set $\{\xi_1,\ldots,\xi_n\}$ and zero otherwise. Although this
choice is not necessarily unique, it does yield a global minimum. Note that
choosing $\delta_i = 1$ is equivalent to selecting a particular eigenprojection
$\Pi_i[W_cW_o]$ corresponding to the eigenvalue $\beta_i\gamma_i/4\alpha_i^2$.

Remark 5.1. The expression in (5.1) can be regarded as a decomposition
of the cost in terms of the state variables. The idea of deleting states based on
their "component costs" is precisely the "component cost analysis" approach of
Skelton ([3,12]).

Using  the  example  it  is  easy  to  see  that  the  balancing  method  of  [2], 
which  selects  eigenprojections  based  upon  the  magnitude  of  the  eigenvalues  of 
$W_cW_o$, i.e., the (squares of the) second-order modes, may yield a grossly
suboptimal  reduced-order  model.  To  this  end  let 

«1  -  a2  *  10*,  -  1  ,  P2  *  10*,  y1  -  1,  y2  «  io3 

so  that 

*1  "  *2  *  r06- 

Clearly $J$ is minimized ($J = \xi_1$) by choosing $\delta_1 = 0$, $\delta_2 = 1$, which
corresponds  to  truncating  the  first  state  variable.  If,  however,  the  method  of 
[2]  is  utilized,  then  judging  by  the  second-order  modes 


<r2  •  (2.5)1/2*10“2  =  .012, 


the  second  state  variable  should  be  deleted.  This,  however,  corresponds  to 
choosing $\delta_1 = 1$, $\delta_2 = 0$ with the much higher cost $J = \xi_2$. The fact that the
balancing  approach  of  [2]  fails  to  determine  a  solution  of  the  optimal 
model-reduction  problem  should  not  be  surprising  in  view  of  the  fact  that  the 
error  criterion  plays  no  role  in  the  balancing  technique. 
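The diagonal example can be enumerated directly. The sketch below uses hypothetical two-state data (the values of `alpha`, `beta` and `gamma` are chosen here for illustration and are not the values of the example in the text). For every choice of retained states it checks that the diagonal pseudogramians satisfy (2.21) and (2.22), evaluates the cost from (2.15), and compares the cost-minimizing choice with the choice made by ranking second-order modes.

```python
import itertools
import numpy as np

# Hypothetical two-state data (NOT the values used in the text's example)
alpha = np.array([1.0, 1000.0])
beta = np.array([1.0, 100.0])
gamma = np.array([1.0, 100.0])
n, n_m = 2, 1

A = np.diag(-alpha)
BBt = np.diag(beta)                            # B B^T with V = I
CtC = np.diag(gamma)                           # C^T C with R = I
Wc = np.diag(beta / (2 * alpha))               # solves (2.7)
Wo = np.diag(gamma / (2 * alpha))              # solves (2.8)

xi = beta * gamma / (2 * alpha)                # component costs in (5.1)
sigma = np.sqrt(beta * gamma) / (2 * alpha)    # second-order modes

costs = {}
for keep in itertools.combinations(range(n), n_m):
    delta = np.zeros(n)
    delta[list(keep)] = 1.0
    tau = np.diag(delta)
    tau_perp = np.eye(n) - tau
    Q_hat, P_hat = tau @ Wc, tau @ Wo
    # Residuals of (2.21) and (2.22): every 0/1 choice is an extremal
    r21 = A @ Q_hat + Q_hat @ A.T + BBt - tau_perp @ BBt @ tau_perp.T
    r22 = A.T @ P_hat + P_hat @ A + CtC - tau_perp.T @ CtC @ tau_perp
    assert np.allclose(r21, 0) and np.allclose(r22, 0)
    costs[keep] = 2 * np.trace((Q_hat @ P_hat - Wc @ Wo) @ A)   # (2.15)

best = min(costs, key=costs.get)               # retained states at the global minimum
balanced_choice = tuple(sorted(int(i) for i in np.argsort(sigma)[::-1][:n_m]))
```

With these illustrative numbers the second state has the smaller second-order mode but the larger component cost, so ranking by second-order modes retains the first state at cost 5 while retaining the second state costs only 0.5.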


Although the above solution exploited the simple structure of this
example, it is clear that choosing the global minimum from amongst the local
extrema involves an eigenprojection decomposition of the cost $J$. To extend this
idea to more general systems, we invoke the following heuristic approximation.

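Approximation 5.1. Let $\Psi$ define the balanced basis as in (4.6). Then
$\Psi$ also approximately defines a balanced optimal projection basis, i.e.,

$$\Psi\hat{Q}\Psi^T \approx \Psi^{-T}\hat{P}\Psi^{-1} \approx \bar{\tau}\Sigma, \tag{5.2}$$

where extremal

$$\bar{\tau} \triangleq \Psi\tau\Psi^{-1} = \mathrm{diag}(\delta_1,\ldots,\delta_n), \qquad \sum_{i=1}^{n}\delta_i = n_m. \tag{5.3}$$

Proposition 5.1. If Approximation 5.1 holds for extremal
$(A_m,B_m,C_m)$ then, with $\bar{\tau}_\perp \triangleq I_n - \bar{\tau}$,

$$J(A_m,B_m,C_m) = -2\,\mathrm{tr}[\bar{\tau}_\perp\Sigma^2\bar{A}] = -2\sum_{i=1}^{n}\sigma_i^2\bar{A}_{ii}(1-\delta_i). \tag{5.4}$$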

Remark 5.2. From (4.7) and (4.8) it follows that (5.4) can be written
either as

$$J(A_m,B_m,C_m) = \mathrm{tr}[\bar{\tau}_\perp\Sigma\bar{B}V\bar{B}^T] = \sum_{i=1}^{n}\sigma_i(\bar{B}V\bar{B}^T)_{ii}(1-\delta_i) \tag{5.5}$$

or

$$J(A_m,B_m,C_m) = \mathrm{tr}[\bar{\tau}_\perp\Sigma\bar{C}^TR\bar{C}] = \sum_{i=1}^{n}\sigma_i(\bar{C}^TR\bar{C})_{ii}(1-\delta_i). \tag{5.6}$$

Hence, Approximation 5.1 leads to the following cost-component ranking
(again, in the sense of Skelton [3,12]) of the extrema satisfying the optimal
projection equations.

Cost-Component Ranking. Assume Approximation 5.1 is valid and choose
the eigenprojections comprising extremal $\tau$ such that

$\delta_i = 1$, if $-\sigma_i^2\bar{A}_{ii}$ is among the $n_m$ largest elements of the set $\{-\sigma_r^2\bar{A}_{rr}\}_{r=1}^{n}$;
$\delta_i = 0$, otherwise.

For comparison purposes we shall also consider the following ranking of
the eigenprojections based upon the eigenvalues of $W_cW_o$ (i.e., second-order modes).

Eigenvalue Ranking. Choose the eigenprojections comprising extremal $\tau$
such that

$\delta_i = 1$, if $\sigma_i$ is among the $n_m$ largest elements of the set $\{\sigma_r\}_{r=1}^{n}$;
$\delta_i = 0$, otherwise.

Remark  5.3.  The  observation  that  the  second-order  modes  alone  may  be  a 
poor  guide  to  determining  an  optimal  reduced-order  model  has  recently  been  made  in 
[13]  where  bounds  on  the  model-error  criterion  were  given  involving  both  the 
second-order  modes  and  suitable  weights  called  balanced  gains.  It  can  be  seen 
from Proposition 5.1 that the role of balanced gains in our approach is played by
the elements $-\sigma_i\bar{A}_{ii}$ when Approximation 5.1 holds. It can also be seen that
the balanced gains of Kabamba yield bounds on the component costs of Skelton.

6. Numerical Solution of the Optimal Projection Equations

Insofar as the ultimate aim of any model-reduction technique is to
permit the development of numerical procedures for reducing high-order models, the
optimal projection equations, comprising a coupled system of modified Lyapunov
equations, appear promising in this regard. Therefore, we present an iterative
computational algorithm that exploits the structure of these equations and the
available insights. The reader is strongly reminded that the proposed algorithm
is but a first attempt at solving these new equations and alternative algorithms
may yet be devised. The basis of this algorithm is the ability to write the
modified Lyapunov equations (2.21), (2.22) in the form of "standard" Lyapunov
equations (6.1), (6.2) such that the pseudogramians $\hat{Q}$ and $\hat{P}$ are extracted at the
final step (6.6). It follows from (2.32)-(2.35) that (2.21), (2.22) are indeed
equivalent to (6.1), (6.2) (with $k = \infty$) and (6.6).


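Algorithm:

Step 1: Initialize $\tau^{(0)} = I_n$;

Step 2: Solve for $\hat{Q}^{(k)}$, $\hat{P}^{(k)}$:

$$0 = (A - \tau^{(k)} A \tau_\perp^{(k)})\hat{Q}^{(k)} + \hat{Q}^{(k)}(A - \tau^{(k)} A \tau_\perp^{(k)})^T + BVB^T, \tag{6.1}$$

$$0 = (A - \tau_\perp^{(k)} A \tau^{(k)})^T\hat{P}^{(k)} + \hat{P}^{(k)}(A - \tau_\perp^{(k)} A \tau^{(k)}) + C^TRC; \tag{6.2}$$

Step 3: Balance:

$$\Phi^{(k)}\hat{Q}^{(k)}(\Phi^{(k)})^T = (\Phi^{(k)})^{-T}\hat{P}^{(k)}(\Phi^{(k)})^{-1} = \Sigma^{(k)},$$

$$\Sigma^{(k)} = \mathrm{diag}(\sigma_1^{(k)},\ldots,\sigma_n^{(k)}), \quad \sigma_1^{(k)} \ge \sigma_2^{(k)} \ge \cdots \ge \sigma_n^{(k)} \ge 0; \tag{6.3}$$

Step 4: If $k > 1$ check for convergence:

$$e_k = \frac{\mathrm{tr}(C^TRCW_c) - \mathrm{tr}\big(C^TRC\,\tau^{(k)}\hat{Q}^{(k)}(\tau^{(k)})^T\big)}{\mathrm{tr}(C^TRCW_c)}; \tag{6.4}$$

If $|e_k - e_{k-1}| <$ tolerance then go to Step 8;

Else continue;

Step 5: Select $n_m$ eigenprojections:

$$\Pi_{i_1}[\hat{Q}^{(k)}\hat{P}^{(k)}], \ldots, \Pi_{i_{n_m}}[\hat{Q}^{(k)}\hat{P}^{(k)}],$$

$$\Pi_i[\hat{Q}^{(k)}\hat{P}^{(k)}] \triangleq S^{(k)}E_i(S^{(k)})^{-1};$$

Step 6: Update:

$$\tau^{(k+1)} = \sum_{r=1}^{n_m} \Pi_{i_r}[\hat{Q}^{(k)}\hat{P}^{(k)}]; \tag{6.5}$$

Step 7: Check for convergence; if not, increment $k$ and return to Step 2;

Step 8: Set:

$$\hat{Q} = \tau^{(\infty)}\hat{Q}^{(\infty)}(\tau^{(\infty)})^T, \quad \hat{P} = (\tau^{(\infty)})^T\hat{P}^{(\infty)}\tau^{(\infty)}. \tag{6.6}$$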


For convenience we shall adopt the notation $(A_m^{(k)}, B_m^{(k)}, C_m^{(k)})$,
where $k > 0$, to denote the reduced-order model obtained as a result of applying the
projection $\tau^{(k)}$ and we define

$$\varepsilon_k \triangleq \varepsilon(A_m^{(k)}, B_m^{(k)}, C_m^{(k)}),$$

i.e., the relative error associated with $(A_m^{(k)}, B_m^{(k)}, C_m^{(k)})$. Note
that in general $e_k \ne \varepsilon_k$ since $e_k$ denotes the relative error only when
convergence has been reached.
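
The steps above can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name and arguments are hypothetical, equations (6.1), (6.2) and (6.4) are coded as read here, the explicit balancing of Step 3 is replaced by a direct eigendecomposition of the product of the two Lyapunov solutions (whose eigenvector matrix plays the role of $S^{(k)}$ in Step 5), and only the eigenvalue ranking is implemented. The small symmetric test system is made up for the demonstration.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def optimal_projection_iteration(A, B, C, n_m, V=None, R=None,
                                 tol=1e-10, max_iter=50):
    """Iterate Steps 1-8 using the eigenvalue ranking at Step 5."""
    n = A.shape[0]
    V = np.eye(B.shape[1]) if V is None else V
    R = np.eye(C.shape[0]) if R is None else R
    BVBt, CtRC = B @ V @ B.T, C.T @ R @ C
    # Controllability gramian: 0 = A Wc + Wc A^T + B V B^T.
    Wc = solve_continuous_lyapunov(A, -BVBt)
    denom = np.trace(CtRC @ Wc)
    tau, errors = np.eye(n), []                        # Step 1
    for k in range(max_iter):
        tau_perp = np.eye(n) - tau
        Aq = A - tau @ A @ tau_perp
        Ap = A - tau_perp @ A @ tau
        Q = solve_continuous_lyapunov(Aq, -BVBt)       # Step 2, (6.1)
        P = solve_continuous_lyapunov(Ap.T, -CtRC)     # Step 2, (6.2)
        e_k = (denom - np.trace(CtRC @ tau @ Q @ tau.T)) / denom  # (6.4)
        errors.append(e_k)
        converged = k > 1 and abs(e_k - errors[-2]) < tol          # Step 4
        if converged or k == max_iter - 1:
            break
        # Steps 3 and 5 (substitute): Q P = S diag(lam) S^{-1}.
        lam, S = np.linalg.eig(Q @ P)
        S_inv = np.linalg.inv(S)
        idx = np.argsort(-lam.real)[:n_m]              # eigenvalue ranking
        # Step 6: sum of the selected eigenprojections S E_i S^{-1}.
        tau = (S[:, idx] @ S_inv[idx, :]).real
    Q_hat = tau @ Q @ tau.T                            # Step 8, (6.6)
    P_hat = tau.T @ P @ tau
    return tau, Q_hat, P_hat, errors

# Hypothetical stable, symmetric test system (not from the text).
A = np.array([[-2.0, 1.0, 0.0], [1.0, -3.0, 1.0], [0.0, 1.0, -4.0]])
B = np.array([[1.0], [0.5], [0.2]])
C = B.T
tau, Q_hat, P_hat, errors = optimal_projection_iteration(A, B, C, n_m=1)
```

The sketch makes no attempt to guard against an unstable intermediate matrix in (6.1) or (6.2); a production routine would need to check this at every iteration.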


It should be clear from the discussion in the previous section that the
crucial step of the algorithm is Step 5, the choice of the eigenprojections. For
the examples which follow we shall invoke consistently at Step 5 either the
cost-component ranking based upon Approximation 5.1 or the eigenvalue ranking.

Remark 6.1. Note that in the special case $R = I$ and $V = I$, the
first iteration of the algorithm yields $\hat{Q}^{(0)} = W_c$, $\hat{P}^{(0)} = W_o$. If, at Step
5, we choose $i_r = r$, $r = 1,\ldots,n_m$, i.e., the eigenprojections are selected
according to the eigenvalue ranking, then $(A_m^{(1)}, B_m^{(1)}, C_m^{(1)})$ is precisely
the reduced-order model obtained from balancing.

We  shall  first  consider  the  following  example  which  was  treated  by  both 
Wilson  and  Moore.  In  this  example  and  those  that  follow  assume  R  =  Im,  V  =  Ip. 

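Example 6.1.

$$A = \begin{bmatrix} 0 & 0 & 0 & -150 \\ 1 & 0 & 0 & -245 \\ 0 & 1 & 0 & -113 \\ 0 & 0 & 1 & -19 \end{bmatrix}, \quad B = \begin{bmatrix} 4 \\ 1 \\ 0 \\ 0 \end{bmatrix}.$$
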

Table 1 summarizes the results obtained for the three cases
$n_m = 3, 2, 1$ utilizing the eigenvalue ranking. In each case the proposed
algorithm converged linearly in fewer than eight iterations and in each case
improvement is evident over previously published results. As pointed out in [2],
Wilson's result seems to imply a lack of final convergence. For this example the
balancing approach yields a reduced-order model close to the global minimum.
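
As a hedged numerical companion to Example 6.1, the sketch below computes the gramians and the eigenvalues of $W_c W_o$ for the matrices $A$ and $B$ given above with $R = I$, $V = I$. The output matrix `C` is an assumption (it is not legible in the text as reproduced here); the row vector used is the one that pairs naturally with this canonical form.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0, 0, 0, -150],
              [1, 0, 0, -245],
              [0, 1, 0, -113],
              [0, 0, 1, -19]], dtype=float)
B = np.array([[4.0], [1.0], [0.0], [0.0]])
C = np.array([[0.0, 0.0, 0.0, 1.0]])   # assumption: not legible in the text

# Poles of the full-order model.
poles = np.sort(np.linalg.eigvals(A).real)

# Gramians: 0 = A Wc + Wc A^T + B B^T and 0 = A^T Wo + Wo A + C^T C.
Wc = solve_continuous_lyapunov(A, -B @ B.T)
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)

# Eigenvalues of Wc Wo in decreasing order (used by the eigenvalue ranking).
modes = np.sort(np.linalg.eigvals(Wc @ Wo).real)[::-1]

# Full-order cost tr(C^T C Wc), the denominator of the relative error.
cost_full = float(np.trace(C @ Wc @ C.T))
```

The identity tr(C Wc C^T) = tr(B^T Wo B) provides a quick consistency check on the two Lyapunov solutions.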


Table 1. Relative Error $e_\infty = \varepsilon_\infty$

Order $n_m$     Wilson [1]     Moore [2]     Optimal Projection Equations



We now turn to a pair of interesting examples considered in [13].

Example 6.2.

» --ill]-  --[xi.]*  =  -“T- 

Table  2  summarizes  the  results  obtained  using  the  eigenvalue  ranking 
and  Table  3  gives  the  results  when  the  cost-component  ranking  is  used.  It  is 
clear  that  the  former  method  directs  the  algorithm  to  the  global  maximum  whereas 
the  latter  approach  yields  the  global  minimum. 

Example  6.3. 

::44  ■-[£). 

Table 4 reports the results obtained using either the cost-component
ranking or the eigenvalue ranking, which agree for this example. If the
alternative eigenprojection is selected then, as expected, the algorithm converges
to a global maximum (see Table 5). The interesting aspect of this example, as
discussed in [13], is that the error $\varepsilon_1 = .5245$ for the reduced-order model
obtained by either eigenprojection ranking is actually greater than the error .3849
obtained by choosing the alternative reduced-order model. This situation seems to
indicate that proper eigenprojection selection based upon a cost decomposition is
able to direct the algorithm to the global minimum in cases for which the starting
values are not nearby.

Table 2. Example 6.2 with Eigenvalue Ranking

k     e_k

1     .9950371897
2     .9950371691
3     .9950371690


Table 5. Example 6.3 with the Opposite Ranking

k     e_k

1     .7624928516
2     .9999999961
3     .9999999975

7. The Optimal Projection Equations for Fixed-Order Dynamic Compensation and
Reduced-Order State Estimation


We  briefly  discuss  the  relationship  between  the  optimal  projection 
equations  for  model  reduction  and  analogous  results  for  reduced-order  control  and 
estimation  problems. 

Fixed-Order Dynamic-Compensation Problem. Given the control system

$$\dot{x} = Ax + Bu + w_1, \tag{7.1}$$

$$y = Cx + w_2, \tag{7.2}$$

design a fixed-order dynamic compensator

$$\dot{x}_c = A_c x_c + B_c y, \tag{7.3}$$

$$u = C_c x_c \tag{7.4}$$

which minimizes the performance criterion

$$J(A_c,B_c,C_c) = \lim_{t\to\infty} E[x^T R_1 x + u^T R_2 u], \tag{7.5}$$

where $u \in R^m$, $x_c \in R^{n_c}$, $n_c \le n$, $w_1$ is white disturbance noise, $w_2$ is nonsingular
white observation noise, $R_1$ is nonnegative definite and $R_2$ is positive definite.
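
For a given candidate compensator, the criterion (7.5) can be evaluated from a single Lyapunov equation for the closed-loop covariance. The sketch below is illustrative only: the function name is hypothetical, `V1` and `V2` denote assumed intensities of $w_1$ and $w_2$ (symbols not introduced in the text), and the two noises are assumed uncorrelated.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, block_diag

def closed_loop_cost(A, B, C, Ac, Bc, Cc, R1, R2, V1, V2):
    """Steady-state cost (7.5) for the compensator (Ac, Bc, Cc)."""
    # Closed-loop dynamics of the stacked state [x; x_c].
    A_cl = np.block([[A, B @ Cc], [Bc @ C, Ac]])
    if np.max(np.linalg.eigvals(A_cl).real) >= 0:
        raise ValueError("closed loop is not asymptotically stable")
    # Noise intensity and state weighting of the stacked system
    # (assumes w1 and w2 are uncorrelated).
    V_cl = block_diag(V1, Bc @ V2 @ Bc.T)
    R_cl = block_diag(R1, Cc.T @ R2 @ Cc)
    # Steady-state covariance: 0 = A_cl Q + Q A_cl^T + V_cl.
    Q_cl = solve_continuous_lyapunov(A_cl, -V_cl)
    return float(np.trace(Q_cl @ R_cl))

# Scalar sanity check: with zero gains the plant runs open loop.
one = np.array([[1.0]])
zero = np.array([[0.0]])
J_open = closed_loop_cost(A=-one, B=one, C=one, Ac=-one, Bc=zero, Cc=zero,
                          R1=one, R2=one, V1=one, V2=one)
```

With zero compensator gains the cost reduces to the open-loop state variance $V_1/2$ for this scalar plant, which is a convenient check before evaluating nontrivial designs.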

Necessary conditions characterizing optimal $(A_c,B_c,C_c)$ have been
developed in [18-22] along the same lines as the Main Theorem. These conditions,
called the optimal projection equations for fixed-order dynamic compensation,
consist of four matrix equations (two modified Riccati equations and two modified
Lyapunov equations) coupled by a projection. The modified Riccati equations, not
surprisingly, are similar in form to the covariance and cost Riccati equations of
LQG theory and the modified Lyapunov equations are similar to the optimal
model-reduction equations (2.13) and (2.14). Hence, while the modified Riccati
equations govern optimal estimation and optimal control, the additional modified
Lyapunov equations characterize "optimal reduction". The important fact that all
four equations are coupled supports the view that optimal fixed-order dynamic
compensators cannot in general be designed by means of a stepwise procedure, e.g.,
by either open-loop model reduction followed by LQG or LQG followed by closed-loop
model reduction.


Midway between the model-reduction and fixed-order dynamic-compensation
problems lies the following problem.


Reduced-Order State-Estimation Problem. Given the system

$$\dot{x} = Ax + w_1, \tag{7.6}$$

$$y = Cx + w_2, \tag{7.7}$$

design a reduced-order state estimator

$$\dot{x}_e = A_e x_e + B_e y, \tag{7.8}$$

$$y_e = C_e x_e, \tag{7.9}$$

which minimizes the estimation criterion

$$J(A_e,B_e,C_e) = \lim_{t\to\infty} E[(Lx - y_e)^T R (Lx - y_e)],$$

where $x_e \in R^{n_e}$, $L \in R^{p \times n}$ and $L$ identifies the states, or linear combinations of states,
whose estimates are desired. The order $n_e$ of the estimator state $x_e$ is
determined by implementation constraints, i.e., by the computing capability
available for realizing (7.8) and (7.9) in real time.


In view of the results already given it should not be surprising (see
[23]) that the optimal projection equations for reduced-order state estimation
form a system of three matrix equations (a pair of modified Lyapunov equations
along with a single modified Riccati equation) coupled by a projection which
determines the gains of the optimal reduced-order estimator. This intrinsic
coupling between the "operations" of optimal estimation (the modified Riccati
equation) and optimal model reduction (the pair of modified Lyapunov equations)
stresses the fact that reduced-order estimators designed by means of either model
reduction followed by "full-order" state estimation or full-order estimation
followed by estimator reduction will generally not be optimal for the given order.

8.  Directions  for  Further  Research 

The  most  important  area  of  research  involves  the  further  development  of 
algorithms  for  solving  the  optimal  projection  equations.  Although  proving  local 
convergence  of  the  proposed  algorithm  appears  possible,  the  more  important  problem 
is  achieving  global  optimality  via  the  component  cost  approach.  Although  the 
global  minimum  was  attained  for  all  examples  attempted  by  the  authors,  it  remains 
to  treat  considerably  more  complex  systems. 

An  interesting  extension  of  the  Main  Theorem  involves  the  case  in  which 
the  original  system  (2.1),  (2.2)  is  a  distributed  parameter  system,  e.g.,  a 
partial differential equation or a functional differential equation. This
generalization, which has been referred to as the "ultimate reduced-order problem"
([24]), may lead to the efficient generation of high-order discretizations for
such  systems.  All  of  the  mathematical  machinery  required  to  generalize  the  Main 
Theorem  to  this  case  has  already  been  applied  to  fixed-order  dynamic  compensation 
in  ([25]). 

9.  Conclusion 

First-order necessary conditions for quadratically optimal
reduced-order modelling of a linear time-invariant plant are expressed in the form
of a pair of $n \times n$ modified Lyapunov equations coupled by an oblique projection.
This form of the necessary conditions considerably simplifies the original form
given by Wilson in [1] and clearly reveals the possible presence of numerous
extrema. The balancing method of Moore given in [2] is shown to yield a
reduced-order model that is "close" to an extremal given by the necessary
conditions. A numerical example shows, however, that this extremal may very well
be the global maximum rather than the desired global minimum. An algorithm is
proposed which exploits the presence of the optimal projection and computes the
various local extrema by the choice of eigenprojections comprising the
projection. A component-cost ranking of the eigenprojections, which is very much
in the spirit of Skelton's method in [3,12], is used to direct the algorithm to
the global optimum.


Appendix:  Proof  of  the  Main  Theorem 


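Introducing the augmented system

$$\dot{\tilde{x}} = \tilde{A}\tilde{x} + \tilde{B}u,$$

$$\tilde{y} = \tilde{C}\tilde{x},$$

where $\tilde{y} \triangleq y - y_m$ and

$$\tilde{A} \triangleq \begin{bmatrix} A & 0 \\ 0 & A_m \end{bmatrix}, \quad \tilde{B} \triangleq \begin{bmatrix} B \\ B_m \end{bmatrix}, \quad \tilde{C} \triangleq [\,C \;\; -C_m\,],$$

leads to the expression

$$J(A_m,B_m,C_m) = \mathrm{tr}\,\tilde{Q}\tilde{R}, \tag{A.1}$$

where $\tilde{R} \triangleq \tilde{C}^T R \tilde{C}$ and the nonnegative-definite steady-state covariance $\tilde{Q}$ of $\tilde{x}$ is
given by the (unique) solution of

$$0 = \tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^T + \tilde{V}, \tag{A.2}$$
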


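with $\tilde{V} \triangleq \tilde{B}V\tilde{B}^T$. To minimize (A.1) subject to the constraint (A.2), form the
Lagrangian

$$L(A_m,B_m,C_m,\tilde{Q}) \triangleq \mathrm{tr}[\lambda\tilde{Q}\tilde{R} + (\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^T + \tilde{V})\tilde{P}]$$

with multipliers $\lambda \ge 0$ and $\tilde{P} \in R^{(n+n_m)\times(n+n_m)}$. Since $\mathcal{A}_+$ is an open set the
standard Lagrange multiplier rule can be applied.

Using formulas for computing partial derivatives ([26]) it follows that

$$0 = L_{\tilde{Q}} = \tilde{A}^T\tilde{P} + \tilde{P}\tilde{A} + \lambda\tilde{R}.$$

Since $\lambda = 0$ implies $\tilde{P} = 0$ (recall $\tilde{A}$ is stable), we can take $\lambda = 1$ without loss of
generality. Hence $\tilde{P}$ is the (unique nonnegative-definite) solution of

$$0 = \tilde{A}^T\tilde{P} + \tilde{P}\tilde{A} + \tilde{R}. \tag{A.3}$$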
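
The cost (A.1) is straightforward to evaluate numerically once (A.2) is solved. The following sketch builds the augmented matrices exactly as defined above and returns $\mathrm{tr}\,\tilde{Q}\tilde{R}$; the function name and the scalar test models are hypothetical, and all of $A$, $A_m$ are assumed stable so that (A.2) has a unique solution.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, block_diag

def model_reduction_cost(A, B, C, Am, Bm, Cm, V=None, R=None):
    """Evaluate J(Am, Bm, Cm) = tr(Q~ R~) via the augmented system."""
    V = np.eye(B.shape[1]) if V is None else V
    R = np.eye(C.shape[0]) if R is None else R
    A_aug = block_diag(A, Am)                  # A~ = diag(A, Am)
    B_aug = np.vstack([B, Bm])                 # B~ = [B; Bm]
    C_aug = np.hstack([C, -Cm])                # C~ = [C, -Cm]
    R_aug = C_aug.T @ R @ C_aug                # R~ = C~^T R C~
    # (A.2): 0 = A~ Q~ + Q~ A~^T + B~ V B~^T.
    Q_aug = solve_continuous_lyapunov(A_aug, -B_aug @ V @ B_aug.T)
    return float(np.trace(Q_aug @ R_aug))

one = np.array([[1.0]])
# A model compared with itself has zero error.
J_same = model_reduction_cost(-one, one, one, -one, one, one)
# 1/(s+1) approximated by 1/(s+2): squared L2 error of the impulse responses.
J_diff = model_reduction_cost(-one, one, one, -2.0 * one, one, one)
```

For the second call the cost equals the integral of $(e^{-t} - e^{-2t})^2$ over $[0,\infty)$, namely $1/2 - 2/3 + 1/4 = 1/12$.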

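Again using formulas from [26] and performing some manipulation it follows that

$$0 = L_{A_m} = Q_{12}^T P_{12} + Q_2 P_2, \tag{A.4}$$

$$0 = L_{B_m} = 2(P_{12}^T B + P_2 B_m)V, \tag{A.5}$$

$$0 = L_{C_m} = 2R(C_m Q_2 - C Q_{12}), \tag{A.6}$$

where $\tilde{Q}$ and $\tilde{P}$ have been partitioned as

$$\tilde{Q} = \begin{bmatrix} Q_1 & Q_{12} \\ Q_{12}^T & Q_2 \end{bmatrix}, \quad \tilde{P} = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix}. \tag{A.7}$$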


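Since (as will be seen shortly) $Q_2$ and $P_2$ are positive definite, define

$$\hat{Q} \triangleq Q_{12}Q_2^{-1}Q_{12}^T, \quad \hat{P} \triangleq P_{12}P_2^{-1}P_{12}^T, \tag{A.8}$$

$$G \triangleq Q_2^{-1}Q_{12}^T, \quad \Gamma \triangleq -P_2^{-1}P_{12}^T \tag{A.9}$$

and note that (A.4) implies that (2.5) holds with $M = Q_2P_2$. Since
$Q_2P_2 = P_2^{-1/2}(P_2^{1/2}Q_2P_2^{1/2})P_2^{1/2}$, $M$ is positive semisimple. The rank conditions (2.12) follow
from Sylvester's inequality. Expanding (A.2) and (A.3) yields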


$$0 = AQ_1 + Q_1A^T + BVB^T, \tag{A.10}$$

$$0 = AQ_{12} + Q_{12}A_m^T + BVB_m^T, \tag{A.11}$$

$$0 = A_mQ_2 + Q_2A_m^T + B_mVB_m^T, \tag{A.12}$$

$$0 = A^TP_1 + P_1A + C^TRC, \tag{A.13}$$

$$0 = A^TP_{12} + P_{12}A_m - C^TRC_m, \tag{A.14}$$

$$0 = A_m^TP_2 + P_2A_m + C_m^TRC_m. \tag{A.15}$$

Since $A_m$ is stable and $(A_m,B_m)$ is controllable, standard results (e.g.,
[27], p. 277) imply that $Q_2$ is positive definite. Similarly, $P_2$ is positive
definite.


It is easy to see at this point that $A_m$, $B_m$ and $C_m$ are
independent of $Q_1$ and $P_1$ and thus (A.10) and (A.13) can be ignored. Now,
substituting (2.10), (2.11) and the identities

$$Q_{12} = \hat{Q}\Gamma^T, \quad Q_2 = \Gamma\hat{Q}\Gamma^T, \tag{A.16}$$

$$P_{12} = -\hat{P}G^T, \quad P_2 = G\hat{P}G^T \tag{A.17}$$

into (A.11), (A.12), (A.14) and (A.15) yields

$$0 = A\hat{Q}\Gamma^T + \hat{Q}\Gamma^TA_m^T + BVB^T\Gamma^T, \tag{A.18}$$

$$0 = A_m\Gamma\hat{Q}\Gamma^T + \Gamma\hat{Q}\Gamma^TA_m^T + \Gamma BVB^T\Gamma^T, \tag{A.19}$$

$$0 = A^T\hat{P}G^T + \hat{P}G^TA_m + C^TRCG^T, \tag{A.20}$$

$$0 = A_m^TG\hat{P}G^T + G\hat{P}G^TA_m + GC^TRCG^T. \tag{A.21}$$


Computing (A.19) $-\ \Gamma$(A.18) implies

$$A_m = \Gamma A\hat{Q}\Gamma^T(\Gamma\hat{Q}\Gamma^T)^{-1},$$

which, since $\Gamma\hat{Q}\Gamma^T = Q_2$, yields (2.9). Alternatively, (2.9) can be obtained
from (A.21) $-\ G$(A.20).

If we now substitute (2.9) into (A.18)-(A.21) and use the easily
verified relations (2.20), it follows that (A.19) $= \Gamma$(A.18) and (A.21) $= G$(A.20)
and thus (A.19) and (A.21) are redundant. Finally, $G^T$(A.18)$^T$ and (A.20)$\Gamma$
yield (2.13) and (2.14), respectively. Note that these last multiplications
entail no loss of generality since $\rho(G) = \rho(\Gamma) = n_m$.

To show that the optimal projection equations entail no loss of
generality over (A.2)-(A.6), let $\hat{Q}$, $\hat{P}$ be extremal and define $Q_{12}$, $Q_2$, $P_{12}$, $P_2$
by (A.16) and (A.17) for some $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$ and let $Q_1$, $P_1$
satisfy (A.10) and (A.13). Then it is straightforward to reverse the steps taken
in the proof to arrive at (A.2)-(A.6). □

Proof of Proposition 2.1. Extremal $\hat{Q}$, $\hat{P}$ leads to $\tilde{Q}$, $\tilde{P}$ as in (A.7) satisfying
(A.2)-(A.6). Computing

$$J(A_m,B_m,C_m) = \mathrm{tr}(Q_1C^TRC - 2Q_{12}C_m^TRC) + \mathrm{tr}(Q_2C_m^TRC_m) = \mathrm{tr}[C^TRC(W_c - \hat{Q})],$$

noting that (2.13), (2.14) are equivalent to (2.21), (2.22) because of (2.20) and
using (2.23), leads to (2.15). □

DYNAMIC COMPENSATION OF INFINITE-DIMENSIONAL SYSTEMS*

ABSTRACT

One  of  the  major  difficulties  in  designing  implementable 
finite-dimensional  controllers  for  distributed  parameter  systems  is  that  such 
systems  are  inherently  infinite  dimensional  while  controller  dimension  is  severely 
constrained  by  on-line  computing  capability.  While  some  approaches  to  this 
problem  initially  seek  a  correspondingly  infinite-dimensional  control  law  whose 
finite-dimensional  approximation  may  be  of  impractically  high  order,  the  usual 
engineering  approach  involves  first  approximating  the  distributed  parameter  system 
with  a  high-order  discretized  model  followed  by  design  of  a  relatively  low-order 
dynamic  controller.  Among  the  numerous  approaches  suggested  for  the  latter  step 
are  model/controller  reduction  techniques  used  in  conjunction  with  the  standard 
LQG  result.  An  alternative  approach,  developed  in  [36],  relies  upon  the  discovery 
that  the  necessary  conditions  for  optimal  fixed-order  dynamic  compensation  can  be 
transformed  into  a  set  of  equations  possessing  remarkable  structural  coherence. 

The  present  paper  generalizes  this  result  to  apply  directly  to  the  distributed 
parameter system itself. In contrast to the pair of operator Riccati equations
for the "full-order" LQG case, the optimal finite-dimensional fixed-order dynamic
compensator is characterized by four operator equations (two modified Riccati
equations and two modified Lyapunov equations) coupled by an oblique projection
whose rank is precisely equal to the order of the compensator and which determines
the optimal compensator gains. This "optimal projection" is obtained by a
full-rank  factorization  of  the  product  of  the  finite-rank  nonnegative-definite 
Hilbert-space  operators  which  satisfy  the  pair  of  modified  Lyapunov  equations. 

The  coupling  represents  a  graphic  portrayal  of  the  demise  of  the  classical 
separation  principle  for  the  finite-dimensional  reduced-order  controller  case. 

The  results  obtained  apply  to  a  semigroup  formulation  in  Hilbert  space  and  thus 
are  applicable  to  control  problems  involving  a  broad  range  of  specific  partial  and 
functional  differential  equations. 


*This  work  was  performed  at  Lincoln  Laboratory/MIT  and  was  sponsored  by  the 
Department  of  the  Air  Force.  Both  authors  are  currently  with  Harris 
Corporation,  Government  Aerospace  Systems  Division,  Controls  Analysis  and 
Synthesis  Group,  Melbourne,  FL  32901. 


1. INTRODUCTION


One  of  the  major  difficulties  in  designing  active  controllers  for 
distributed  parameter  systems  is  that  such  systems  are  inherently  infinite 
dimensional  while  implementable  controllers  are  necessarily  finite  dimensional 
with  controller  dimension  severely  constrained  by  on-line  computing  capability. 

As  pointed  out  by  Balas  ([1],  see  also  [2]),  control  design  for  distributed 
parameter  systems  entails  the  practical  constraints  of  1)  finitely  many  sensors 
and  actuators,  2)  a  finite-dimensional  controller  and  3)  natural  system 
dissipation.  The  validity  of  2)  is  apparent  from  the  fact  that  processing  and 
transmitting  electrical  signals  by  conventional  analog  or  digital  components 
constitutes  finite-dimensional  action.  Although  distributed  parameter  devices  can 
also  be  utilized,  their  fabrication  and  implementation  can  incorporate  at  most  a 
finite  number  of  design  specifications.*  Hence,  although  distributed  parameter 
systems  are  most  accurately  represented  by  infinite-dimensional  models,  real-world 
constraints  require  that  implementable  controllers  be  modelled  as  lumped  parameter 
systems. 

Clearly,  the  above  observations  effectively  preclude  the  possibility  of 
realizing  infinite-dimensional  controllers  that  involve  full-state  feedback  or 
full-state  estimation  (see,  e.g.,  [4-6]  and  the  numerous  references  therein). 
Although  finite-dimensional  approximation  schemes  have  been  applied  to  optimal 
infinite-dimensional  control  laws  ([7-9]),  these  results  only  guarantee  optimality 
in the limit, i.e., as the order of the approximating controller increases without
bound. Hence, there is no guarantee that a particular approximate (i.e.,
discretized) controller is actually optimal over the class of approximate
controllers of a given order which may be dictated by implementation constraints.
Moreover, even if an optimal approximate finite-dimensional controller could be
obtained, it would almost certainly be suboptimal in the class of all controllers
of the given order.

*Examples of such components include tapped delay lines and surface acoustic wave
devices. Although acoustoelectric convolvers ([3], p. 465) can perform
continuous-time integration, synthesis of the desired impulse response kernel can
incorporate only finitely many specified parameters. The obvious fact should
also be noted that physical limitations impose an upper bound on the number of
design parameters that can be incorporated in the construction of any device.

Although  the  usual  engineering  approach  to  this  problem  is  to  replace 
the  distributed  parameter  system  with  a  high-order  finite-dimensional  model, 
analogous,  fundamental  difficulties  remain  since  application  of  LQG  leads  to  a 
controller  whose  order  is  identical  to  that  of  the  high-order  approximate  model. 
Attempts  to  remedy  this  problem  usually  rely  upon  some  method  of  open-loop  model 
reduction  or  closed-loop  controller  reduction  (see,  e.g.,  [10-15]).  Most  of  these 
techniques  (with  the  exception  of  [11])  are  ad  hoc  in  nature,  however,  and  hence 
guarantees  of  optimality  and  stability  may  be  lacking. 

A  more  direct  approach  that  avoids  both  model  and  controller  reduction 
is  to  fix  the  controller  structure  and  optimize  the  performance  criterion  with 
respect  to  the  controller  parameters.  Although  much  effort  has  been  devoted  to 
this  approach  (see,  e.g.,  [16-30]),  progress  in  this  direction  has  been  impeded  by 
the  extreme  complexity  of  the  nonlinear  matrix  equations  arising  from  the 
first-order necessary conditions. What was lacking, to quote the insightful
remarks of [24], was a "deeper understanding of the structural coherence of these
equations." The key to unlocking these unwieldy equations was subsequently
discovered by Hyland in [31] and developed in [32-36]. Specifically, it was found
that these equations harbored the definition of an oblique projection (i.e.,
idempotent matrix) which is a consequence of optimality and not the result of an
ad hoc assumption. By exploiting the presence of this "optimal projection," the
originally very complex stationary conditions can be transformed without loss of
generality into much simpler and more tractable forms. The resulting equations
(see (2.10)-(2.17) of [36]) preserve the simple form of LQG relations for the
gains  in  terms  of  covariance  and  cost  matrices  which,  in  turn,  are  determined  by  a 
coupled  system  of  two  modified  Riccati  equations  and  two  modified  Lyapunov 
equations.  This  coupling,  by  means  of  the  optimal  projection,  represents  a 
graphic  portrayal  of  the  demise  of  the  classical  separation  principle  for  the 
reduced-order  controller  case.  When,  as  a  special  case,  the  order  of  the 
compensator  is  required  to  be  equal  to  the  order  of  the  plant,  the  modified 
Riccati  equations  immediately  reduce  to  the  standard  LQG  Riccati  equations  and  the 
modified  Lyapunov  equations  express  the  proviso  that  the  compensator  be  minimal, 
i.e.,  controllable  and  observable.  Since  the  LQG  Riccati  equations  as  such  are 
nothing  more  than  the  necessary  conditions  for  full-order  compensation,  the 
"optimal projection equations" appear to provide a clear and simple generalization
of  standard  LQG  theory. 
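
To make the notion of an oblique projection concrete, the short sketch below builds an idempotent matrix of prescribed rank from two rectangular factors. The factorization $\tau = G^T\Gamma$ with $\Gamma G^T = I$ and the numerical values are illustrative assumptions chosen for the demonstration; they are not taken from the equations of [36].

```python
import numpy as np

# Two 2x3 factors chosen so that Gamma @ G.T is the 2x2 identity.
G = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
Gamma = np.array([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 3.0]])

# Candidate projection: idempotent because Gamma @ G.T = I.
tau = G.T @ Gamma

is_idempotent = np.allclose(tau @ tau, tau)      # tau^2 = tau
rank_tau = np.linalg.matrix_rank(tau)            # equals the inner dimension
is_orthogonal = np.allclose(tau, tau.T)          # False: the projection is oblique
```

The rank of the projection equals the number of rows of the factors, which in the setting of the text corresponds to the order of the compensator.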

The fact that the optimal projection equations consist of four coupled
matrix equations, i.e., two modified Riccati equations and two modified Lyapunov
equations, can readily be explained by the following simple reason. Reduced-order
control-design methods often involve either LQG applied to a reduced-order model
or model reduction applied to a full-order LQG design, and hence both approaches
require the solution of precisely four equations: two Riccati equations (for LQG)
plus two Lyapunov equations (for system reduction via balancing, as in [12,14]).
The coupled form of the optimal projection equations is thus a strong reminder
that the LQG and order-reduction operations cannot be iterated but must, in a
certain sense, be performed simultaneously. This situation is partly due to the
fact that the optimal projection matrix may not be of the form
$\begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}$ even in
the basis corresponding to the "balanced" realization ([12,14]). This point is
explored in [37] where the solution to the optimal model-reduction problem is
characterized by a pair of modified Lyapunov equations which are also coupled by
an optimal projection.

Returning  now  to  the  distributed  parameter  problem,  it  should  be 
mentioned  that  notable  exceptions  to  the  previously-mentioned  work  on  distributed 
parameter  controllers  are  the  contributions  of  Johnson  ([38])  and  Pearson 
([39,40])  who  suggest  fixing  the  order  of  the  finite-dimensional  compensator  while 
retaining  the  distributed  parameter  model.  Progress  in  this  direction,  however, 
was  impeded  not  only  by  the  intractability  of  the  optimality  conditions  that  were 
available  for  the  finite-dimensional  problem  (as  in  [16-30]),  but  also  by  the  lack 
of  a  suitable  generalization  of  these  conditions  to  the  infinite-dimensional 
case.  The  purpose  of  the  present  paper  is  to  make  significant  progress  in  filling 
these  gaps,  i.e.,  by  deriving  explicit  optimality  conditions  which  directly 
characterize  the  optimal  finite-dimensional  fixed-order  dynamic  compensator  for  an 
infinite-dimensional  system  and  which  are  exactly  analogous  to  the 
highly-simplified  optimal  projection  equations  obtained  in  [31-34,36]  for  the 
finite-dimensional  case.  Specifically,  instead  of  a  system  of  four  matrix 
equations  we  obtain  a  system  of  four  operator  equations  whose  solutions 
characterize  the  optimal  finite-dimensional  fixed-order  dynamic  compensator. 
Moreover,  the  optimal  projection  now  becomes  a  bounded  idempotent  Hilbert-space 
operator  whose  rank  is  precisely  equal  to  the  order  of  the  compensator. 

The  mathematical  setting  we  use  is  standard:  a  linear  time-invariant 
differential  system  in  Hilbert  space  with  additive  white  noise,  finitely  many 
controls  and  finitely  many  noisy  measurements  (thus  satisfying  the  first  practical 
constraint  mentioned  above).  The  input  and  output  maps  are  assumed  to  be 
bounded.  Since  the  only  explicit  assumption  on  the  unbounded  dynamics  operator  is 
that  it  generate  a  strongly  continuous  semigroup,  the  results  are  potentially 
applicable  to  a  broad  range  of  specific  partial  and  functional  differential 
equations.  The  actual  applicability  of  our  results  is  essentially  limited  by 
practical constraint 3). Since we are concerned with the steady-state problem, we
implicitly  assume  that  the  distributed  parameter  system  is  stabilizable,  i.e., 
that  there  exists  a  dynamic  compensator  of  a  given  order  such  that  the  closed-loop 
system  is  uniformly  stable.  We  note  that  stabilizing  compensators  do  exist  for 
the  wide  class  of  problems  considered  in  [41]  and  [42]  which  includes  delay, 
parabolic  and  damped  hyperbolic  systems.  The  question  of  how  much  damping  is 
required  for  stabilizability  of  hyperbolic  systems  is  a  crucial  issue  in  designing 
controllers  for  large  flexible  space  structures  ([7,  43-49a]). 

It  is  important  to  point  out  that  the  results  of  this  paper  can 
immediately  be  specialized  to  finite-dimensional  systems  by  requiring  that  the 
Hilbert  space  characterizing  the  dynamical  system  be  finite-dimensional.  Then  all 
unboundedness considerations can be ignored, adjoints can be interpreted as
transposes  and  other  obvious  simplifications  can  be  invoked.  The  only 
mathematical  aspect  requiring  attention  is  the  treatment  of  white  noise  which,  for 
general  handling  of  the  infinite-dimensional  case,  is  interpreted  according  to 
[6].*  For  the  finite-dimensional  case,  however,  the  standard  classical  notions 
suffice  and  the  results  go  through  with  virtually  no  modifications. 

*Alternatively, we could have adopted the white noise formulation of [4].
However, this would have required additional technical assumptions on the plant
(see Theorem 5.35, p. 148 of [4]). The main difference between the two white
noise formalisms is that Balakrishnan works with finitely additive rather than
countably additive measures. Strictly speaking, then, even in finite dimensions
Balakrishnan's white noise is different from the standard notion (see [6], pp.
307, 315).

The contents of the paper are as follows. Section 2 contains
preliminary notation in addition to particular results for use later in the
paper. Section 3 presents the optimal steady-state finite-dimensional fixed-order
dynamic-compensation problem and the Main Theorem gives the necessary conditions
in the form of the optimal projection equations (3.15)-(3.18). We then develop a
series of results which serve to elucidate several aspects of the Main Theorem.
Section 4 is devoted to the proof of the Main Theorem. The reader is alerted to
the two crucial steps required. The first step involves generalizing to the
infinite-dimensional case the derivation of the necessary conditions in their
"primitive" form (see (4.27)-(4.29) and (4.48)-(4.53)). The derivation in [31-33,
36]  involving  Lagrange  multipliers  is  invalid  in  the  infinite-dimensional  case  due 
to  the  presence  of  the  unbounded  system-dynamics  operator.  Instead,  we  use  the 
gramian  form  of  the  closed-loop  covariance  operator  to  obtain  a  dual  problem 
formulation  and  then  proceed  to  derive  the  primitive  necessary  conditions  by  means 
of  a  lengthy,  but  direct,  computation  (Lemma  4.7).  The  second  crucial  step 
involves  transformation  of  the  primitive  form  of  the  necessary  conditions  to  the 
final  form  given  in  the  Main  Theorem.  This  laborious  computation  was  first 


2.  PRELIMINARIES 


In  this  section  we  introduce  general  notation  along  with  basic 
definitions  and  results  for  use  in  later  sections.  Our  principal  references  are 
[6],  [50]  and  [51]. 

Throughout this section let $\mathcal{H}$, $\mathcal{H}'$ and $\mathcal{H}''$ denote real separable Hilbert
spaces with norm $\|\cdot\|$ and inner product $\langle\cdot,\cdot\rangle$ and let $\mathcal{B}(\mathcal{H},\mathcal{H}')$ denote the space
of bounded linear operators from $\mathcal{H}$ into $\mathcal{H}'$. For $L \in \mathcal{B}(\mathcal{H},\mathcal{H}')$, $\|L\|$ is the norm of
$L$, $\mathcal{R}(L)$ is the range of $L$, $\mathcal{N}(L)$ is the null space of $L$, $\rho(L)$ is the rank of $L$ (set
$\rho(L) = \infty$ if $L$ does not have finite rank), $L^{-1}$ is the inverse of $L$ when $L$ is
invertible, i.e., when $L$ has a bounded inverse, $L^*$ is the adjoint of $L$ and
$L^{-*} \triangleq (L^*)^{-1}$. Recall that $\|L\| = \|L^*\|$ and that $\rho(L) = \rho(L^*)$ ([50], p. 161).
If $LL^* = L^*L$ then $L$ is normal. Now suppose that $\mathcal{H} = \mathcal{H}'$ so that $L \in \mathcal{B}(\mathcal{H}) \triangleq \mathcal{B}(\mathcal{H},\mathcal{H})$.
If $L = L^*$ then $L$ is selfadjoint. If $L$ is selfadjoint and $\langle Lx,x\rangle \ge 0$, $x \in \mathcal{H}$,
then $L$ is nonnegative definite. Note that the selfadjointness assumption is
included in the definition since the Hilbert spaces are assumed real. If $L$ is
nonnegative definite then $L^{1/2}$ denotes the (unique) nonnegative-definite square
root of $L$. Call $L$ semisimple (resp., real semisimple, nonnegative semisimple) if
there exists invertible $S \in \mathcal{B}(\mathcal{H})$ such that $SLS^{-1}$ is normal (resp., selfadjoint,
nonnegative definite). This implies that $SLS^{-1}$ has a complete set of
orthonormal eigenvectors and, in the real-semisimple or nonnegative-semisimple
cases, has real or nonnegative eigenvalues.

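Recall that if $S \in \mathcal{B}(\mathcal{H})$ is compact then $S$ has at most a countable
number of eigenvalues and all nonzero eigenvalues have finite multiplicity.
Hence, for $L \in \mathcal{B}(\mathcal{H},\mathcal{H}')$ compact, let $\{\sigma_i\}$ be the (at most countable) sequence of
eigenvalues of $(LL^*)^{1/2}$ with appropriate multiplicity and $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$ ([50],
p. 261). Then $\mathcal{B}_1(\mathcal{H},\mathcal{H}')$ denotes the set of trace class (or nuclear) operators,
i.e., the set of $L$ for which $\sum_i \sigma_i < \infty$ ([50], p. 521). $\mathcal{B}_1(\mathcal{H},\mathcal{H}')$ is a Banach space
with norm

$$\|L\|_1 \triangleq \sum_i \sigma_i.$$

If $\sum_i \sigma_i^2 < \infty$ then $L \in \mathcal{B}_2(\mathcal{H},\mathcal{H}')$, the set of Hilbert-Schmidt operators,
which is a Banach space with norm

$$\|L\|_2 \triangleq \Big[\sum_i \sigma_i^2\Big]^{1/2}.$$

Note that $\|L\| \le \|L\|_2 \le \|L\|_1$, $\|L\| = \|L^*\|$, $\|L\|_1 = \|L^*\|_1$ and
$\|L\|_2 = \|L^*\|_2$. If $\mathcal{H} = \mathcal{H}'$, then we write $\mathcal{B}_1(\mathcal{H})$ and $\mathcal{B}_2(\mathcal{H})$ for $\mathcal{B}_1(\mathcal{H},\mathcal{H})$
and $\mathcal{B}_2(\mathcal{H},\mathcal{H})$, respectively. Note that if nonnegative-definite
$L \in \mathcal{B}_1(\mathcal{H})$ then $L^{1/2} \in \mathcal{B}_2(\mathcal{H})$.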

If $L \in \mathcal{B}_1(\mathcal{H},\mathcal{H}')$ and $S \in \mathcal{B}(\mathcal{H}',\mathcal{H}'')$ then

$$\|SL\|_1 \le \|S\|\,\|L\|_1$$

and hence $SL \in \mathcal{B}_1(\mathcal{H},\mathcal{H}'')$. Similarly, under suitable hypotheses,

$$\|LS\|_1 \le \|S\|\,\|L\|_1$$

and

$$\|SL\|_1 \le \|S\|_2\,\|L\|_2.$$

Lemma 2.1. Suppose $L \in \mathcal{B}_1(\mathcal{H})$ and let $\{\lambda_i\}$ denote the nonzero
eigenvalues of $L$ with appropriate multiplicity. Then ([51], p. 89)

$$\sum_i |\lambda_i| \le \|L\|_1.$$

If $L$ is selfadjoint then ([50], p. 522)

$$\sum_i |\lambda_i| = \|L\|_1.$$

If $L$ is nonnegative definite then

$$\sum_i \lambda_i = \|L\|_1.$$


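Let $L \in \mathcal{B}_1(\mathcal{H})$. Then define ([50], p. 523) the trace functional
$\mathrm{tr}: \mathcal{B}_1(\mathcal{H}) \to \mathbb{R}$ by

$$\mathrm{tr}\, L \triangleq \sum_i \langle L\phi_i, \phi_i\rangle,$$

where the summation is independent of the choice of orthonormal basis $\{\phi_i\}$.
The trace satisfies $\mathrm{tr}\, L = \mathrm{tr}\, L^*$, $\mathrm{tr}\, SL = \mathrm{tr}\, LS$ for all $S \in \mathcal{B}(\mathcal{H})$, $\mathrm{tr}\, ST = \mathrm{tr}\, TS$
for all $S,T \in \mathcal{B}_2(\mathcal{H})$ and $\mathrm{tr}(\alpha T + \beta S) = \alpha(\mathrm{tr}\, T) + \beta(\mathrm{tr}\, S)$ for all $\alpha,\beta \in \mathbb{R}$ and
$S,T \in \mathcal{B}_1(\mathcal{H})$.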

Lemma 2.2. Suppose $L \in \mathcal{B}_1(\mathcal{H})$ and let $\{\lambda_i\}$ denote the nonzero
eigenvalues of $L$ with appropriate multiplicity. Then ([51], p. 139)

$$\mathrm{tr}\, L = \sum_i \lambda_i$$

and hence

$$|\mathrm{tr}\, L| \le \|L\|_1.$$

If $L$ is nonnegative definite then

$$\mathrm{tr}\, L = \|L\|_1.$$
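
As a finite-dimensional illustration of Lemma 2.2 (a sketch that is not part of the original development), the bound $|\mathrm{tr}\, L| \le \|L\|_1$ can be checked on real 2x2 matrices. The helper names below are hypothetical. The nuclear norm is computed from the identity $(\sigma_1 + \sigma_2)^2 = (\sigma_1^2 + \sigma_2^2) + 2\sigma_1\sigma_2$, where $\sigma_1^2 + \sigma_2^2$ is the squared Frobenius norm and $\sigma_1\sigma_2 = |\det L|$; this shortcut is specific to the 2x2 case.

```python
import math

def trace(m):
    """Trace of a 2x2 matrix given as nested lists."""
    return m[0][0] + m[1][1]

def nuclear_norm_2x2(m):
    """Sum of the singular values of a real 2x2 matrix.

    Uses (s1 + s2)^2 = (s1^2 + s2^2) + 2*s1*s2, with
    s1^2 + s2^2 = squared Frobenius norm and s1*s2 = |det m|.
    """
    frob_sq = sum(x * x for row in m for x in row)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.sqrt(frob_sq + 2.0 * abs(det))

# Illustrative matrices (assumptions, not data from the text).
rotation = [[0.0, 1.0], [-1.0, 0.0]]   # not selfadjoint: strict inequality
psd = [[2.0, 1.0], [1.0, 2.0]]         # nonnegative definite: equality

for name, m in (("rotation", rotation), ("psd", psd)):
    print(name, "tr =", trace(m), "||.||_1 =", nuclear_norm_2x2(m))
```

The rotation matrix shows that the inequality can be strict when $L$ is not selfadjoint, while the nonnegative-definite example attains equality, in line with the last statement of the lemma.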

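Corollary 2.1. For each $S \in \mathcal{B}(\mathcal{H})$ the linear functionals

$$L \mapsto \mathrm{tr}\, SL: \mathcal{B}_1(\mathcal{H}) \to \mathbb{R}, \qquad L \mapsto \mathrm{tr}\, LS: \mathcal{B}_1(\mathcal{H}) \to \mathbb{R}$$

are continuous. For each $L \in \mathcal{B}_1(\mathcal{H})$ the linear functionals

$$S \mapsto \mathrm{tr}\, LS: \mathcal{B}(\mathcal{H}) \to \mathbb{R}, \qquad S \mapsto \mathrm{tr}\, SL: \mathcal{B}(\mathcal{H}) \to \mathbb{R}$$

are continuous.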

Although  showing  that  a  bounded  linear  operator  is  trace  class  is 
slightly  more  involved  than  the  above  characterizations  of  J^Ul),  the  following 
result  will  suffice  for  our  purposes  (see  [52],  p.  96,  or  [6],  p.  115). 
Lemma 2.3. Let $L \in \mathcal{B}(\mathcal{H})$ be nonnegative definite. Then

$$\sum_i \langle L\phi_i, \phi_i\rangle,$$

whether finite or infinite, is independent of the orthonormal basis $\{\phi_i\}$. The
summation is finite if and only if $L \in \mathcal{B}_1(\mathcal{H})$.

Many  of  the  operators  introduced  in  the  following  section  have 
finite-dimensional  domain  or  range  space  and  hence  are  degenerate,  i.e.,  have 
finite  rank.  Recall  that  degenerate  operators  are  necessarily  trace  class.  The 
following  result,  which  generalizes  Theorem  2.1,  p.  240  of  [53]  in  certain 
respects,  will  be  fundamental  in  decomposing  finite-rank  operators. 

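Lemma 2.4. Suppose $L_1,\ldots,L_r \in \mathcal{B}(\mathcal{H},\mathcal{H}')$ have finite rank. Then
there exists a finite-dimensional subspace $\mathcal{M} \subset \mathcal{H}$ such that $L_i\mathcal{M}^\perp = 0$, $i = 1,\ldots,r$.
Furthermore, if $\mathcal{H} = \mathcal{H}'$ then $\mathcal{M}$ can be chosen such that $L_i\mathcal{M} \subset \mathcal{M}$, $i = 1,\ldots,r$.

Proof. It suffices to consider the case $r = 1$. Writing $L$ for $L_1$, note
that since $\rho(L^*) < \infty$, $\mathcal{N}(L)^\perp = \mathcal{R}(L^*)$ ([50], p. 155) and $\mathcal{N}(L)$ is closed, the first
statement holds with $\mathcal{M} = \mathcal{N}(L)^\perp$. When $\mathcal{H} = \mathcal{H}'$ set $\mathcal{M} = \mathcal{N}(L)^\perp + \mathcal{R}(L)$ and note that
$\mathcal{M}^\perp = \mathcal{N}(L) \cap \mathcal{R}(L)^\perp \subset \mathcal{N}(L)$ and $L\mathcal{M} \subset \mathcal{R}(L) \subset \mathcal{M}$. □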

The  following  generalization  of  Sylvester's  inequality  ([54],  p.  66) 
will  be  used  repeatedly  in  handling  finite-rank  operators. 

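Lemma 2.5. Let $L \in \mathcal{B}(\mathcal{H},\mathcal{H}')$ and $S \in \mathcal{B}(\mathcal{H}',\mathcal{H}'')$. Then

$$\rho(SL) \le \min\{\rho(S), \rho(L)\}. \tag{2.1}$$

If $\dim \mathcal{H}' = \nu < \infty$, then

$$\rho(S) + \rho(L) - \nu \le \rho(SL). \tag{2.2}$$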
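
The two rank bounds (2.1) and (2.2) can be sanity-checked on small matrices. The following is a minimal sketch under the assumption that all three spaces are finite-dimensional; the `rank` and `matmul` helpers are hypothetical names introduced only for this illustration, and exact rational arithmetic is used so that the rank computation has no round-off.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Illustrative data: L maps R^3 into R^3 (so nu = 3), S maps R^3 into R^2.
L = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # rank 2
S = [[0, 1, 0], [0, 0, 1]]              # rank 2
nu = 3

rho_S, rho_L, rho_SL = rank(S), rank(L), rank(matmul(S, L))
print(rho_S + rho_L - nu, "<=", rho_SL, "<=", min(rho_S, rho_L))
```

In this example the lower bound (2.2) is attained, since part of the range of $L$ falls in the null space of $S$.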

Proof. If either $S$ or $L$ does not have finite rank then (2.1) is
immediate. If both $S$ and $L$ have finite rank then the standard arguments ([54])
used to prove the finite-dimensional version of (2.1) remain valid. To prove
(2.2), note that Lemma 2.4 implies that there exist orthonormal bases for $\mathcal{H}$ and $\mathcal{H}'$
with respect to which $L$ has the matrix representation $[\hat{L} \;\; 0]$, where $\hat{L} \in \mathbb{R}^{\nu \times p}$.
Similarly, there exist orthonormal bases for $\mathcal{H}''$ and $\mathcal{H}'$ with respect to which $S$ has
the matrix representation

$$\begin{bmatrix} \hat{S} \\ 0 \end{bmatrix},$$

where $\hat{S} \in \mathbb{R}^{q \times \nu}$. Since the two cited bases for $\mathcal{H}'$
may be different, let orthogonal $U \in \mathbb{R}^{\nu \times \nu}$ be the matrix representation (with
respect to either basis for $\mathcal{H}'$) for the change in orthonormal basis ([6], p. 100).
Hence $SL$ has the matrix representation

$$\begin{bmatrix} \hat{S}U\hat{L} & 0 \\ 0 & 0 \end{bmatrix}$$

and (2.2) follows from the known result ([54], p. 66). □


As in the proof of Lemma 2.5, we shall utilize the infinite-matrix
representation of an operator with respect to an orthonormal basis. All matrix
representations given here will consist of real entries since the Hilbert spaces
involved are real. When the orthonormal bases are specified and no confusion can
arise, we shall not differentiate between an operator and its matrix
representation. We shall use the infinite identity matrix $I_\infty$ interchangeably
with the identity $I_{\mathcal{H}}$ on $\mathcal{H}$.

When dealing with finite-dimensional Euclidean spaces the notation and
terminology introduced above will be utilized with only minor changes. For
example, bounded linear operators will be represented by matrices whose elements
are determined according to fixed orthonormal bases and hence we identify
$\mathbb{R}^{m \times n} = \mathcal{B}(\mathbb{R}^n, \mathbb{R}^m)$. Note that if $L \in \mathcal{B}(\mathbb{R}^n, \mathcal{H})$ and $S \in \mathcal{B}(\mathcal{H}, \mathbb{R}^m)$ then $SL$
is an $m \times n$ matrix which is independent of any particular orthonormal basis for $\mathcal{H}$.
The transposes of $x \in \mathbb{R}^n = \mathbb{R}^{n \times 1}$ and $M \in \mathbb{R}^{m \times n}$ are denoted by $x^T$ and $M^T$
and $M^{-T} \triangleq (M^T)^{-1}$. Let $I_n$ denote the $n \times n$ identity matrix.

To specialize some of the above operator terminology to matrices, let
$M \in \mathbb{R}^{n \times n}$. We shall say $M$ is nonnegative (resp., positive) diagonal if $M$ is
diagonal with nonnegative (resp., positive) diagonal elements. $M$ is nonnegative
(resp., positive) definite if $M$ is symmetric and $x^T M x \ge 0$ (resp., $x^T M x > 0$, $x \ne 0$),
$x \in \mathbb{R}^n$. Recall that $M$ is symmetric (resp., nonnegative definite, positive
definite) if and only if there exists orthogonal $U \in \mathbb{R}^{n \times n}$ such that $UMU^T$ is
diagonal (resp., nonnegative diagonal, positive diagonal). $M$ is semisimple ([55],
p. 13), or nondefective ([56], p. 375), if $M$ has $n$ linearly independent
eigenvectors, i.e., $M$ has a diagonal Jordan canonical form over the complex
field. $M$ is real (resp., nonnegative, positive) semisimple if $M$ is semisimple
with real (resp., nonnegative, positive) eigenvalues. Note that $M$ is real (resp.,
nonnegative, positive) semisimple if and only if there exists invertible $S \in \mathbb{R}^{n \times n}$
such that $SMS^{-1}$ is diagonal (resp., nonnegative diagonal, positive diagonal).
Alternatively, $M$ is real (resp., nonnegative, positive) semisimple if and only if
there exists invertible $S \in \mathbb{R}^{n \times n}$ such that $SMS^{-1}$ is symmetric (resp.,
nonnegative definite, positive definite).


Lemma 2.6. The product of two nonnegative- (resp., positive-) definite
matrices is nonnegative (resp., positive) semisimple.

Proof. If $S, L \in \mathbb{R}^{n \times n}$ are both nonnegative (resp., positive) definite
then by Theorem 6.2.5, p. 123 of [55] there exists invertible $\Phi \in \mathbb{R}^{n \times n}$ such that
$D_S \triangleq \Phi^{-1} S \Phi^{-T}$ and $D_L \triangleq \Phi^T L \Phi$ are nonnegative (resp., positive) diagonal. Hence,
$SL = \Phi D_S D_L \Phi^{-1}$ is nonnegative (resp., positive) semisimple, as desired.
Alternatively, if either $S$ or $L$ is positive definite, then the result follows from
$SL = L^{-1/2}(L^{1/2} S L^{1/2})L^{1/2}$ if $L$ is positive definite or $SL = S^{1/2}(S^{1/2} L S^{1/2})S^{-1/2}$ if
$S$ is positive definite. □
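
Lemma 2.6 can be illustrated numerically. The sketch below uses two hand-picked positive-definite 2x2 matrices (assumed data, not taken from the text) and hypothetical helper names. It shows that the product is not symmetric, yet its eigenvalues are real, positive and distinct, so the product is positive semisimple.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial; raises if they are complex."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return (tr + root) / 2.0, (tr - root) / 2.0

# Two positive-definite matrices (illustrative choices).
S = [[2.0, 1.0], [1.0, 1.0]]
L = [[1.0, 0.0], [0.0, 3.0]]

SL = matmul2(S, L)
lam_max, lam_min = eigenvalues_2x2(SL)
print("SL =", SL)
print("symmetric:", SL[0][1] == SL[1][0])
print("eigenvalues:", lam_max, lam_min)
```

Distinct eigenvalues guarantee two linearly independent eigenvectors, which is the semisimplicity asserted by the lemma in this small case.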
3.  PROBLEM  STATEMENT  AND  THE  MAIN  THEOREM


We consider the following steady-state fixed-order dynamic-compensation
problem. Given the dynamical system on $[0,\infty)$

$$\dot{x}(t) = Ax(t) + Bu(t) + H_1 w(t), \tag{3.1}$$

$$y(t) = Cx(t) + H_2 w(t), \tag{3.2}$$

design a finite-dimensional fixed-order dynamic compensator

$$\dot{x}_c(t) = A_c x_c(t) + B_c y(t), \tag{3.3}$$

$$u(t) = C_c x_c(t) \tag{3.4}$$

which minimizes the steady-state performance criterion

$$J(A_c,B_c,C_c) \triangleq \lim_{t \to \infty} E[\langle R_1 x(t), x(t)\rangle + u(t)^T R_2 u(t)]. \tag{3.5}$$


The following data are assumed. The state $x(t)$ is an element of a real
separable Hilbert space $\mathcal{H}$ and the state differential equation is interpreted in
the weak sense (see, e.g., [6], pp. 229, 317). The closed, densely defined
operator $A: \mathcal{D}(A) \subset \mathcal{H} \to \mathcal{H}$ generates a strongly continuous semigroup $e^{At}$, $t \ge 0$.
The control $u(t) \in \mathbb{R}^m$, $B \in \mathcal{B}(\mathbb{R}^m, \mathcal{H})$ and the operator $R_1 \in \mathcal{B}_1(\mathcal{H})$ and the matrix
$R_2 \in \mathbb{R}^{m \times m}$ are nonnegative definite and positive definite, respectively. $w(\cdot)$ is a
zero-mean Gaussian "standard white noise process" in $L_2((0,\infty), \mathcal{H}')$ (see [6],
p. 314), where $\mathcal{H}'$ is a real separable Hilbert space, $H_1 \in \mathcal{B}_2(\mathcal{H}', \mathcal{H})$, $H_2 \in \mathcal{B}(\mathcal{H}', \mathbb{R}^\ell)$
and "E" denotes expectation. We assume that $H_1 H_2^* = 0$, i.e., the disturbance and
measurement noises are independent, and that $V_2 \triangleq H_2 H_2^* \in \mathbb{R}^{\ell \times \ell}$ is positive definite,
i.e., all measurements are noisy. Note that $V_1 \triangleq H_1 H_1^* \in \mathcal{B}_1(\mathcal{H})$ is nonnegative
definite and trace class.* The initial state $x(0)$ is Gaussian and independent of
$w(\cdot)$. The observation $y(t) \in \mathbb{R}^\ell$ and $C \in \mathcal{B}(\mathcal{H}, \mathbb{R}^\ell)$. The dimension of the
compensator state $x_c(t)$ is of fixed, finite order $n_c \le \dim \mathcal{H}$ and the optimization
is performed over $A_c \in \mathbb{R}^{n_c \times n_c}$, $B_c \in \mathbb{R}^{n_c \times \ell}$ and $C_c \in \mathbb{R}^{m \times n_c}$.

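To handle the closed-loop system (3.1)-(3.4), we introduce the
augmented state space $\tilde{\mathcal{H}} \triangleq \mathcal{H} \oplus \mathbb{R}^{n_c}$ which is a real separable Hilbert space with
inner product $\langle \tilde{x}_1, \tilde{x}_2\rangle \triangleq \langle x_1, x_2\rangle + x_{c1}^T x_{c2}$, $\tilde{x}_i \triangleq (x_i, x_{ci})$. An operator $\tilde{L} \in \mathcal{B}(\tilde{\mathcal{H}})$ has
a "decomposition" into operators $L_1 \in \mathcal{B}(\mathcal{H})$, $L_{12} \in \mathcal{B}(\mathbb{R}^{n_c}, \mathcal{H})$, $L_{21} \in \mathcal{B}(\mathcal{H}, \mathbb{R}^{n_c})$ and
$L_2 \in \mathbb{R}^{n_c \times n_c}$ in the sense that for $\tilde{x} = (x, x_c) \in \tilde{\mathcal{H}}$, $\tilde{L}\tilde{x} = (L_1 x + L_{12} x_c, L_{21} x + L_2 x_c)$,
or, in "block" form,

$$\tilde{L} = \begin{bmatrix} L_1 & L_{12} \\ L_{21} & L_2 \end{bmatrix}.$$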


For later use note that [display illegible in the source] and

$$\|\tilde{L}\| \le \|L_1\| + \|L_{12}\| + \|L_{21}\| + \|L_2\|.$$

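*We must require that $R_1$ and $V_1$ be nuclear since covariance operators in the
white noise formulation of [6] are not necessarily nuclear as they are in the
formulation of [4].

We can similarly construct unbounded operators in $\tilde{\mathcal{H}}$. Hence, define
the closed-loop dynamics operator $\tilde{A}: \mathcal{D}(\tilde{A}) \subset \tilde{\mathcal{H}} \to \tilde{\mathcal{H}}$ on the dense domain
$\mathcal{D}(\tilde{A}) \triangleq \mathcal{D}(A) \times \mathbb{R}^{n_c}$ by $\tilde{A}\tilde{x} \triangleq (Ax + BC_c x_c, B_c Cx + A_c x_c)$. Since $\tilde{A}$ can be represented by

$$\tilde{A} = \begin{bmatrix} A & BC_c \\ B_c C & A_c \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & BC_c \\ B_c C & A_c \end{bmatrix}$$

and since the closed operator $\begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix}: \mathcal{D}(\tilde{A}) \to \tilde{\mathcal{H}}$ generates the strongly
continuous semigroup $\begin{bmatrix} e^{At} & 0 \\ 0 & I_{n_c} \end{bmatrix}$, $t \ge 0$, it follows from Theorem 2.1, p. 497
of [50] that $\tilde{A}$ is also closed and generates a strongly continuous semigroup
$e^{\tilde{A}t} \in \mathcal{B}(\tilde{\mathcal{H}})$, $t \ge 0$. To guarantee that $J$ is finite and independent of initial
conditions we restrict our attention to the set of admissible stabilizing
compensators

$$\mathcal{A} \triangleq \{(A_c,B_c,C_c): e^{\tilde{A}t} \text{ is exponentially stable}\}.$$

Hence if $(A_c,B_c,C_c) \in \mathcal{A}$ then there exist $M > 0$ and $\beta > 0$ such that

$$\|e^{\tilde{A}t}\| \le M e^{-\beta t}, \quad t \ge 0. \tag{3.6}$$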

Since the value of $J$ is independent of the internal realization of the
compensator, we can further restrict our attention to

$$\mathcal{A}_+ \triangleq \{(A_c,B_c,C_c) \in \mathcal{A}: (A_c,B_c) \text{ is controllable and } (C_c,A_c) \text{ is observable}\}.$$

The  following  lemma  is  required  for  the  statement  of  the  Main  Theorem. 

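Lemma 3.1. Suppose $\hat{Q}, \hat{P} \in \mathcal{B}(\mathcal{H})$ have finite rank and are nonnegative
definite. Then $\hat{Q}\hat{P}$ is nonnegative semisimple. Furthermore, if $\rho(\hat{Q}\hat{P}) = n_c$ then
there exist $G, \Gamma \in \mathcal{B}(\mathcal{H}, \mathbb{R}^{n_c})$ and positive-semisimple $M \in \mathbb{R}^{n_c \times n_c}$ such that

$$\hat{Q}\hat{P} = G^* M \Gamma, \tag{3.7}$$

$$\Gamma G^* = I_{n_c}. \tag{3.8}$$

Proof. By Lemma 2.4 there exists a finite-dimensional subspace $\mathcal{M} \subset \mathcal{H}$
such that $\hat{Q}\mathcal{M} \subset \mathcal{M}$, $\hat{Q}\mathcal{M}^\perp = 0$, $\hat{P}\mathcal{M} \subset \mathcal{M}$ and $\hat{P}\mathcal{M}^\perp = 0$. Hence there exists an orthonormal
basis for $\mathcal{H}$ with respect to which $\hat{Q}$ and $\hat{P}$ have the infinite-matrix representations

$$\hat{Q} = \begin{bmatrix} \hat{Q}_1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \hat{P} = \begin{bmatrix} \hat{P}_1 & 0 \\ 0 & 0 \end{bmatrix}.$$

We shall refer to $G, \Gamma \in \mathcal{B}(\mathcal{H}, \mathbb{R}^{n_c})$ and positive-semisimple $M \in \mathbb{R}^{n_c \times n_c}$
satisfying (3.7) and (3.8) as a $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$. For convenience in
stating the Main Theorem define

$$\Sigma \triangleq B R_2^{-1} B^*, \qquad \bar{\Sigma} \triangleq C^* V_2^{-1} C.$$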

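Main Theorem. Suppose $(A_c,B_c,C_c) \in \mathcal{A}_+$ solves the steady-state
fixed-order dynamic-compensation problem. Then there exist nonnegative-definite
$Q, P, \hat{Q}, \hat{P} \in \mathcal{B}_1(\mathcal{H})$ such that $A_c$, $B_c$ and $C_c$ are given by

$$A_c = \Gamma(A - Q\bar{\Sigma} - \Sigma P)G^*, \tag{3.9}$$

$$B_c = \Gamma Q C^* V_2^{-1}, \tag{3.10}$$

$$C_c = -R_2^{-1} B^* P G^*, \tag{3.11}$$

for some $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$, and such that with $\tau \triangleq G^*\Gamma$ the following
conditions are satisfied:

$$Q: \mathcal{D}(A^*) \to \mathcal{D}(A), \qquad P: \mathcal{D}(A) \to \mathcal{D}(A^*), \tag{3.12a,b}$$

$$\hat{Q}: \mathcal{H} \to \mathcal{D}(A), \qquad \hat{P}: \mathcal{H} \to \mathcal{D}(A^*), \tag{3.13a,b}$$

$$\rho(\hat{Q}) = \rho(\hat{P}) = \rho(\hat{Q}\hat{P}) = n_c, \tag{3.14a,b,c}$$

$$0 = (A - \tau Q\bar{\Sigma})Q + Q(A - \tau Q\bar{\Sigma})^* + V_1 + \tau Q\bar{\Sigma}Q\tau^*, \tag{3.15}$$

$$0 = (A - \Sigma P\tau)^* P + P(A - \Sigma P\tau) + R_1 + \tau^* P\Sigma P\tau, \tag{3.16}$$

$$0 = [(A - \Sigma P)\hat{Q} + \hat{Q}(A - \Sigma P)^* + Q\bar{\Sigma}Q]\tau^*, \tag{3.17}$$

$$0 = [(A - Q\bar{\Sigma})^*\hat{P} + \hat{P}(A - Q\bar{\Sigma}) + P\Sigma P]\tau. \tag{3.18}$$

*(3.14a) refers to $\rho(\hat{Q}) = n_c$, etc.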
c 
The content of the Main Theorem is clearly a set of necessary
conditions which characterize the optimal steady-state fixed-order dynamic
compensator when it exists. These necessary conditions consist of a system of
four operator equations including a pair of modified Riccati equations (3.15) and
(3.16) and a pair of modified Lyapunov equations (3.17) and (3.18). The salient
feature of these four equations is the coupling by the operator $\tau \in \mathcal{B}(\mathcal{H})$ which,
because of (3.8), is idempotent, i.e., $\tau^2 = \tau$. In general, $\tau$ is an oblique
projection and not an orthogonal projection since there is no requirement that $\tau$
be selfadjoint. Additional features of the Main Theorem will be discussed in the
remainder of this section. For convenience, let $G$, $M$, $\Gamma$, $\tau$, $Q$, $P$, $\hat{Q}$ and $\hat{P}$ be as
given by the Main Theorem and define $\Lambda \triangleq \mathrm{diag}(\lambda_1,\ldots,\lambda_{n_c})$, where
$\lambda_1 \ge \cdots \ge \lambda_{n_c} > 0$ are the eigenvalues of $M$.
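
The idempotence of $\tau$ and its failure to be selfadjoint can be seen in a very small finite-dimensional example. The sketch below assumes $\mathcal{H} = \mathbb{R}^2$ and $n_c = 1$, so that adjoints become transposes; the particular $G$ and $\Gamma$ are illustrative choices satisfying (3.8), not quantities from the text, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]

# Illustrative data with n = 2 and n_c = 1 (assumed, not from the text).
G = [[1, 0]]        # 1 x 2
Gamma = [[1, 1]]    # 1 x 2, chosen so that Gamma G^T = I_1

condition_38 = matmul(Gamma, transpose(G)) == [[1]]   # analogue of (3.8)

tau = matmul(transpose(G), Gamma)                     # tau = G^T Gamma
is_idempotent = matmul(tau, tau) == tau
is_selfadjoint = tau == transpose(tau)

print("tau =", tau)
print("(3.8) holds:", condition_38)
print("idempotent:", is_idempotent, "selfadjoint:", is_selfadjoint)
```

Here $\tau$ projects onto the span of $(1,0)^T$ along the direction $(1,-1)^T$, which is not orthogonal to the range; choosing $\Gamma = G$ instead would give an orthogonal projection.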


We begin by noting that if $x_c$ is replaced by $Sx_c$, where $S \in \mathbb{R}^{n_c \times n_c}$
is invertible, then an "equivalent" compensator is obtained with $(A_c,B_c,C_c)$
replaced by $(SA_cS^{-1}, SB_c, C_cS^{-1})$.

Proposition 3.1. Let $(A_c,B_c,C_c) \in \mathcal{A}_+$. If $S \in \mathbb{R}^{n_c \times n_c}$ is
invertible then $(SA_cS^{-1}, SB_c, C_cS^{-1}) \in \mathcal{A}_+$ and

$$J(A_c,B_c,C_c) = J(SA_cS^{-1}, SB_c, C_cS^{-1}). \tag{3.19}$$
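
The system-theoretic reason behind Proposition 3.1 is that a change of compensator state basis leaves the compensator's input-output map unchanged. As a rough illustration (not the analytical argument used in the text), the following sketch compares the Markov parameters $C_cA_c^kB_c$ of a hypothetical second-order compensator before and after such a change of basis. All numerical data and helper names are assumptions made for the example; integer entries keep the arithmetic exact.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def markov_parameters(a_c, b_c, c_c, count):
    """Return [C_c B_c, C_c A_c B_c, C_c A_c^2 B_c, ...] with `count` terms."""
    params, power_times_b = [], b_c
    for _ in range(count):
        params.append(matmul(c_c, power_times_b))
        power_times_b = matmul(a_c, power_times_b)
    return params

# Hypothetical compensator data (n_c = 2, one input, one output).
A_c = [[0, 1], [-2, -3]]
B_c = [[0], [1]]
C_c = [[1, 0]]

# Invertible change of basis and its inverse.
S = [[1, 1], [0, 1]]
S_inv = [[1, -1], [0, 1]]

A_s = matmul(matmul(S, A_c), S_inv)   # S A_c S^{-1}
B_s = matmul(S, B_c)                  # S B_c
C_s = matmul(C_c, S_inv)              # C_c S^{-1}

original = markov_parameters(A_c, B_c, C_c, 4)
transformed = markov_parameters(A_s, B_s, C_s, 4)
print("Markov parameters agree:", original == transformed)
```

Since the closed-loop response depends on the compensator only through its input-output behavior, equal Markov parameters are consistent with the equality of costs in (3.19).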


Proof. Although the result is obvious from system-theoretic arguments,
we shall prove it analytically by utilizing elements of the development in Section 4.
Define $\tilde{S} \triangleq \begin{bmatrix} I_{\mathcal{H}} & 0 \\ 0 & S \end{bmatrix} \in \mathcal{B}(\tilde{\mathcal{H}})$ and note that replacing $(A_c,B_c,C_c)$ by
$(SA_cS^{-1}, SB_c, C_cS^{-1})$ is equivalent to replacing $\tilde{A}$, $\tilde{V}$ and $\tilde{R}$ by $\tilde{S}\tilde{A}\tilde{S}^{-1}$, $\tilde{S}\tilde{V}\tilde{S}^*$ and $\tilde{S}^{-*}\tilde{R}\tilde{S}^{-1}$,
respectively. If $M, \beta > 0$ satisfy (3.6) then a straightforward application of the
Hille-Yosida Theorem ([57], pp. 153-5) shows that the strongly continuous
semigroup generated by $\tilde{S}\tilde{A}\tilde{S}^{-1}$ satisfies

$$\|e^{\tilde{S}\tilde{A}\tilde{S}^{-1}t}\| \le \|\tilde{S}\|\,\|\tilde{S}^{-1}\|\, M e^{-\beta t},$$

which proves the first assertion. Since $\tilde{S}e^{\tilde{A}t}\tilde{S}^{-1}$, $t \ge 0$, is also a strongly
continuous semigroup with generator $\tilde{S}\tilde{A}\tilde{S}^{-1}$, it follows that $\tilde{S}e^{\tilde{A}t}\tilde{S}^{-1} = e^{\tilde{S}\tilde{A}\tilde{S}^{-1}t}$.
Hence

$$\int_0^\infty e^{\tilde{S}\tilde{A}\tilde{S}^{-1}t}\, \tilde{S}\tilde{V}\tilde{S}^*\, e^{(\tilde{S}\tilde{A}\tilde{S}^{-1})^*t}\, dt = \tilde{S}\tilde{Q}\tilde{S}^*$$

and (3.19) follows from $\mathrm{tr}\, \tilde{Q}\tilde{R} = \mathrm{tr}\, (\tilde{S}\tilde{Q}\tilde{S}^*)(\tilde{S}^{-*}\tilde{R}\tilde{S}^{-1})$. □

In view of Proposition 3.1 one would expect the Main Theorem to apply
also to $(SA_cS^{-1}, SB_c, C_cS^{-1})$. Indeed, it may be noted that no claim was
made as to the uniqueness of the $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$ used to determine
$A_c$, $B_c$ and $C_c$ in (3.9)-(3.11). These observations are reconciled by the
following result which shows that a transformation of the compensator state basis
corresponds to the alternative factorization $\hat{Q}\hat{P} = (S^{-T}G)^*(SMS^{-1})(S\Gamma)$ and,
moreover, that all $(G,M,\Gamma)$-factorizations of $\hat{Q}\hat{P}$ are related by a nonsingular
transformation. Note that $\tau$ remains invariant over the class of factorizations.

Proposition 3.2. If $S \in \mathbb{R}^{n_c \times n_c}$ is invertible then $\bar{G} = S^{-T}G$, $\bar{\Gamma} = S\Gamma$
and $\bar{M} = SMS^{-1}$ satisfy

$$\hat{Q}\hat{P} = \bar{G}^* \bar{M} \bar{\Gamma}, \tag{3.7}'$$

$$\bar{\Gamma}\bar{G}^* = I_{n_c}. \tag{3.8}'$$

Conversely, if $\bar{G}, \bar{\Gamma} \in \mathcal{B}(\mathcal{H}, \mathbb{R}^{n_c})$ and invertible $\bar{M} \in \mathbb{R}^{n_c \times n_c}$ satisfy (3.7)' and
(3.8)', then there exists invertible $S \in \mathbb{R}^{n_c \times n_c}$ such that $\bar{G} = S^{-T}G$, $\bar{\Gamma} = S\Gamma$ and
$\bar{M} = SMS^{-1}$.

Proof. The first part of the proposition is immediate. The second
part follows by taking $S \triangleq \bar{M}^{-1}\bar{\Gamma}G^*M$, noting $S^{-1} = M^{-1}\Gamma\bar{G}^*\bar{M}$ and using
the identities $\Gamma\bar{G}^*\bar{M}\bar{\Gamma}G^* = M$ and $\bar{M}\bar{\Gamma}G^* = \bar{\Gamma}G^*M$. □

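The next result shows that there exists a similarity transformation
which simultaneously diagonalizes $\hat{Q}\hat{P}$ and $\tau$.

Proposition 3.3. There exists invertible $\Phi \in \mathcal{B}(\mathcal{H})$ such that

$$\hat{Q} = \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix}\Phi^{-*}, \qquad \hat{P} = \Phi^*\begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi, \tag{3.20a,b}$$

$$\hat{Q}\hat{P} = \Phi^{-1}\begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix}\Phi, \qquad \tau = \Phi^{-1}\begin{bmatrix} I_{n_c} & 0 \\ 0 & 0 \end{bmatrix}\Phi, \tag{3.21a,b}$$

where $\Lambda_{\hat{Q}}, \Lambda_{\hat{P}} \in \mathbb{R}^{n_c \times n_c}$ are positive diagonal and $\Lambda_{\hat{Q}}\Lambda_{\hat{P}} = \Lambda$. Consequently,

$$\hat{Q} = \tau\hat{Q}, \qquad \hat{P} = \hat{P}\tau. \tag{3.22a,b}$$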


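Proof. Proceeding as in the proof of Lemma 3.1, choose an orthonormal
basis for $\mathcal{H}$ with respect to which $\hat{Q} = \begin{bmatrix} \hat{Q}_1 & 0 \\ 0 & 0 \end{bmatrix}$ and $\hat{P} = \begin{bmatrix} \hat{P}_1 & 0 \\ 0 & 0 \end{bmatrix}$, where
$\hat{Q}_1, \hat{P}_1 \in \mathbb{R}^{r \times r}$ are nonnegative definite. By Theorem 6.2.5, p. 123 of [55],
there exists invertible $\Psi \in \mathbb{R}^{r \times r}$ such that $\hat{\Lambda}_{\hat{Q}} \triangleq \Psi\hat{Q}_1\Psi^T$ and $\hat{\Lambda}_{\hat{P}} \triangleq \Psi^{-T}\hat{P}_1\Psi^{-1}$
are nonnegative diagonal. Because of (3.14), it is clear that $\Psi$ can be chosen
so that

$$\hat{\Lambda}_{\hat{Q}} = \begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad \hat{\Lambda}_{\hat{P}} = \begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix},$$

where $\Lambda_{\hat{Q}}, \Lambda_{\hat{P}} \in \mathbb{R}^{n_c \times n_c}$ are positive diagonal. Thus (3.20) holds with
$\Phi = \begin{bmatrix} \Psi & 0 \\ 0 & I_\infty \end{bmatrix}$. From (3.20) it follows that

$$\hat{Q}\hat{P} = \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}}\Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi.$$

Now define $\bar{G} \triangleq (I_{n_c} \;\; 0)\Phi^{-*}$, $\bar{M} \triangleq \Lambda_{\hat{Q}}\Lambda_{\hat{P}}$ and $\bar{\Gamma} \triangleq (I_{n_c} \;\; 0)\Phi$
so that (3.7)' and (3.8)' are satisfied. By the second part of Proposition 3.2
there exists invertible $S \in \mathbb{R}^{n_c \times n_c}$ such that $G = S^T\bar{G}$, $M = S^{-1}\bar{M}S$ and $\Gamma = S^{-1}\bar{\Gamma}$.
Since $M$ and $\bar{M}$ have the same eigenvalues, $\bar{M} = \Lambda$ (modulo an ordering of the diagonal
elements) and thus (3.21a) holds. Finally, (3.21b) follows from

$$\tau = G^*\Gamma = \bar{G}^*\bar{\Gamma} = \Phi^{-1}\begin{bmatrix} I_{n_c} & 0 \\ 0 & 0 \end{bmatrix}\Phi. \qquad □$$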


Remark 3.1. Proposition 3.3 shows that $\lambda_1,\ldots,\lambda_{n_c}$ are the positive
eigenvalues of $\hat{Q}\hat{P}$.


Remark  3.2.  The  simultaneous  diagonalization  in  (3.20)  has  been 
effected  by  a  contragredient  transformation  ([55,58]).  For  applications  of  this 
type  of  transformation  to  model  reduction  and  realization  problems  see  [12, 
59-61].  Simultaneous  diagonalization  of  operators  is  discussed  in  [53],  p.  181. 


The  following  result  permits  the  precise  handling  of  the  unbounded 
operator  A  in  (3.9),  (3.17)  and  (3.18). 


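Proposition 3.4. The following relations hold:

$$\rho(G) = \rho(\Gamma) = \rho(\tau) = n_c, \tag{3.23a,b,c}$$

$$\tau: \mathcal{H} \to \mathcal{D}(A), \qquad \tau^*: \mathcal{H} \to \mathcal{D}(A^*), \tag{3.24a,b}$$

$$G^*: \mathbb{R}^{n_c} \to \mathcal{D}(A), \qquad \Gamma^*: \mathbb{R}^{n_c} \to \mathcal{D}(A^*). \tag{3.25a,b}$$

Proof. From (3.8) and (2.1) it follows that $n_c = \rho(\Gamma G^*) \le
\min\{\rho(\Gamma), \rho(G^*)\}$. Since $\rho(\Gamma) \le n_c$, $\rho(G) = \rho(G^*)$ and $\rho(G) \le n_c$, (3.23a)
and (3.23b) hold. To show (3.23c) either note (3.21b) or use (3.14a) and (3.22)
to obtain

$$n_c = \rho(\hat{Q}) = \rho(\tau\hat{Q}) \le \rho(\tau) = \rho(G^*\Gamma) \le \rho(\Gamma) = n_c.$$

To prove (3.24a) note that (3.22a) implies $\mathcal{R}(\hat{Q}) \subset \mathcal{R}(\tau)$ and thus $\rho(\hat{Q}) = \rho(\tau)$
implies $\mathcal{R}(\hat{Q}) = \mathcal{R}(\tau)$, and similarly for (3.24b). Finally, (3.25) follows from
(3.23), (3.24), the definition $\tau = G^*\Gamma$ and the fact that $\tau^* = \Gamma^*G$. □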
Since the domain of $A$ may not be all of $\mathcal{H}$, expressions involving $A$
require special interpretation. First note that because of the range condition
(3.25a), the expression (3.9) indeed represents an $n_c \times n_c$ matrix (see, e.g.,
[6], p. 80). Similarly, because of (3.25b), $A_c^T$ is given by

$$A_c^T = G(A^* - \bar{\Sigma}Q - P\Sigma)\Gamma^*. \tag{3.26}$$

With regard to (3.15), note that because of (3.12a), the right-hand side of (3.15)
is a linear operator with domain $\mathcal{D}(A^*)$. Since $\Theta \triangleq -\tau Q\bar{\Sigma}Q - Q\bar{\Sigma}Q\tau^* + V_1 +
\tau Q\bar{\Sigma}Q\tau^*$ is continuous on $\mathcal{D}(A^*)$, $AQ + QA^*$ has a continuous extension on $\mathcal{H}$
given precisely by $-\Theta$. Similar remarks apply to (3.16). Analogous domain
conditions were obtained in [5] for a deterministic infinite-dimensional
linear-quadratic control problem with full-state feedback. Finally, because of
(3.24) the right-hand sides of (3.17) and (3.18) denote bounded linear operators
on all of $\mathcal{H}$.

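It is useful to present an alternative form of the optimal projection
equations (3.15)-(3.18). For convenience define the notation

$$\tau_\perp \triangleq I_{\mathcal{H}} - \tau.$$

Proposition 3.5. Equations (3.15)-(3.18) are equivalent, respectively,
to

$$0 = AQ + QA^* + V_1 - Q\bar{\Sigma}Q + \tau_\perp Q\bar{\Sigma}Q\tau_\perp^*, \tag{3.27}$$

$$0 = A^*P + PA + R_1 - P\Sigma P + \tau_\perp^* P\Sigma P\tau_\perp, \tag{3.28}$$

$$0 = (A - \Sigma P)\hat{Q} + \hat{Q}(A - \Sigma P)^* + Q\bar{\Sigma}Q - \tau_\perp Q\bar{\Sigma}Q\tau_\perp^*, \tag{3.29}$$

$$0 = (A - Q\bar{\Sigma})^*\hat{P} + \hat{P}(A - Q\bar{\Sigma}) + P\Sigma P - \tau_\perp^* P\Sigma P\tau_\perp. \tag{3.30}$$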
Proof. The equivalence of (3.27) and (3.28) to (3.15) and (3.16) is
immediate. Using (3.22a) in the form $\hat{Q} = \hat{Q}\tau^*$, we obtain (3.17) = (3.29)$\tau^*$.
Conversely, from (3.22a) and $[(A - \Sigma P)\hat{Q}]^* = \hat{Q}(A - \Sigma P)^*$ (see, e.g., [6], p.
80) it follows that (3.29) = (3.17) + (3.17)$^*$ - $\tau$(3.17). Similarly, (3.18) and
(3.30) are equivalent. □

The form of the optimal projection equations (3.27)-(3.30) helps
demonstrate the relationship between the Main Theorem and the classical LQG result
when $\dim \mathcal{H} = n < \infty$. In this case we need only note that the $(G,M,\Gamma)$-factorization
of $\hat{Q}\hat{P}$ in the "full-order" case $n_c = n$ is given by $G = \Gamma = I_n$ and $M = \hat{Q}\hat{P}$.
Since $\tau = I_n$, and thus $\tau_\perp = 0$, (3.27) and (3.28) reduce to the standard
observer and regulator Riccati equations and (3.9)-(3.11) yield the usual LQG
expressions. Furthermore, note that in the full-order case

$$A_c = A + BC_c - B_cC \tag{3.31}$$

and (3.29) and (3.30) can be written as

$$0 = (A_c + B_cC)\hat{Q} + \hat{Q}(A_c + B_cC)^T + B_cV_2B_c^T, \tag{3.32}$$

$$0 = (A_c - BC_c)^T\hat{P} + \hat{P}(A_c - BC_c) + C_c^TR_2C_c. \tag{3.33}$$

Since, as is well known, the stability of $\tilde{A}$ corresponds to the stability of
$A + BC_c = A_c + B_cC$ and $A - B_cC = A_c - BC_c$, it follows from standard
results (e.g., [62], pp. 48, 277) that the positive-definiteness conditions
(3.14a,b) are equivalent to the assumption that $(A_c,B_c,C_c)$ is controllable
and observable.

To obtain a geometric interpretation of the optimal projection we
introduce the quasi-full-state estimate

$$\hat{x}(t) \triangleq G^*x_c(t) \in \mathcal{H}$$

so that $\tau\hat{x}(t) = \hat{x}(t)$ and $x_c(t) = \Gamma\hat{x}(t)$. Now, the closed-loop system
(3.1)-(3.4) can be written as

$$\dot{x}(t) = Ax(t) + B\hat{C}_c\tau\hat{x}(t) + H_1w(t), \tag{3.34}$$

$$\dot{\hat{x}}(t) = \tau(A + B\hat{C}_c - \hat{B}_cC)\tau\hat{x}(t) + \tau\hat{B}_c(Cx(t) + H_2w(t)), \tag{3.35}$$

where (3.35) is interpreted in the sense of (3.34) since $\hat{x}(t) \in \mathcal{H}$ and where

$$\hat{B}_c \triangleq QC^*V_2^{-1}, \qquad \hat{C}_c \triangleq -R_2^{-1}B^*P.$$

It can thus be seen that the geometric structure of the quasi-full-order
compensator is entirely dictated by the projection $\tau$. In particular, control
inputs $\tau\hat{x}(t)$ determined by (3.35) are contained in $\mathcal{R}(\tau)$ and sensor inputs
$\tau\hat{B}_cy(t)$ are annihilated unless they are contained in $[\mathcal{N}(\tau)]^\perp = \mathcal{R}(\tau^*)$.
Consequently, $\mathcal{R}(\tau)$ and $\mathcal{R}(\tau^*)$ are the control and observation subspaces,
respectively, of the compensator. Since $\tau$ is not necessarily an orthogonal
projection, these (finite-dimensional) subspaces may be different.

From the form of (3.35) it is tempting to suggest that the optimal
fixed-order dynamic compensator can be obtained by projecting the full-order
(infinite-dimensional) LQG compensator. However, this is generally impossible for
the following simple reason. Although the expressions for $A_c$, $B_c$ and $C_c$ in
(3.9)-(3.11) have the form of a projection of the full-order LQG compensator, the
operators $Q$ and $P$ in (3.9)-(3.11) are not the solutions of the usual LQG Riccati
equations but instead must be obtained by simultaneously solving all four coupled
equations (3.15)-(3.18). This observation reinforces the statement made in
Section 1 that the optimal fixed-order dynamic compensator cannot in general be
obtained by LQG followed by closed-loop controller reduction as in [14] and [15].


We now give an explicit characterization of the optimal projection in
terms of $\hat{Q}$ and $\hat{P}$. Since $\hat{Q}\hat{P}$ has finite rank, its Drazin inverse $(\hat{Q}\hat{P})^D$ exists
(see Theorem 6, p. 108 of [63]) and, since $(\hat{Q}\hat{P})^2 = G^*M^2\Gamma$, and hence
$\rho(\hat{Q}\hat{P})^2 = \rho(\hat{Q}\hat{P})$, the "index" of $\hat{Q}\hat{P}$ (see [63,64]) is 1. In this case the Drazin
inverse is traditionally called the group inverse and is denoted by $(\hat{Q}\hat{P})^\#$ (see,
e.g., [64], p. 124 or [65]).

Proposition 3.6. The optimal projection $\tau$ is given by

$$\tau = \hat{Q}\hat{P}(\hat{Q}\hat{P})^\#. \tag{3.36}$$

Proof. It is easy to verify that the conditions characterizing the
Drazin inverse ([63]) for the case that $\hat{Q}\hat{P}$ has index 1 are satisfied by
$G^*M^{-1}\Gamma$. Hence $(\hat{Q}\hat{P})^\# = G^*M^{-1}\Gamma$ and (3.8) implies (3.36). □
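
The verification mentioned in the proof can be carried out explicitly in a small example. The sketch below assumes $\mathcal{H} = \mathbb{R}^2$ and $n_c = 1$, with hand-picked rank-one nonnegative-definite matrices standing in for $\hat{Q}$ and $\hat{P}$ (illustrative data only, and hypothetical helper names). It checks the three defining properties of the group inverse for the candidate $G^TM^{-1}\Gamma$ and then forms the projection as in (3.36).

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]

# Illustrative data with n = 2 and n_c = 1 (assumed, not from the text).
Q_hat = [[2, 0], [0, 0]]     # nonnegative definite, rank 1
P_hat = [[1, 1], [1, 1]]     # nonnegative definite, rank 1
G = [[1, 0]]
M = [[2]]
Gamma = [[1, 1]]

QP = matmul(Q_hat, P_hat)
factorization_ok = (
    QP == matmul(matmul(transpose(G), M), Gamma)    # analogue of (3.7)
    and matmul(Gamma, transpose(G)) == [[1]]        # analogue of (3.8)
)

# Candidate group inverse G^T M^{-1} Gamma.
M_inv = [[Fraction(1, 2)]]
X = matmul(matmul(transpose(G), M_inv), Gamma)

group_inverse_ok = (
    matmul(matmul(QP, X), QP) == QP       # A X A = A
    and matmul(matmul(X, QP), X) == X     # X A X = X
    and matmul(QP, X) == matmul(X, QP)    # A X = X A
)

tau = matmul(QP, X)                       # analogue of (3.36)
print(factorization_ok, group_inverse_ok, tau == matmul(transpose(G), Gamma))
```

In this example the resulting $\tau$ also satisfies $\hat{Q} = \tau\hat{Q}$ and $\hat{P} = \hat{P}\tau$, matching (3.22a,b).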

We now give an alternative characterization of the optimal projection
by introducing the following notation from [51], p. 73. For $\phi, \psi \in \mathcal{H}$ define the
operator $\phi \otimes \psi \in \mathcal{B}(\mathcal{H})$ by

    $(\phi \otimes \psi)x = \langle x,\phi\rangle \psi, \quad x \in \mathcal{H},$

and note that $\rho(\phi \otimes \psi) = 1$ if $\phi$ and $\psi$ are both nonzero and $(\phi \otimes \psi)^* = \psi \otimes \phi$.
Using this notation, (3.21a) can be written as

    $\Phi \hat{Q}\hat{P} \Phi^{-1} = \sum_{i=1}^{n_c} \lambda_i\, \xi_i \otimes \xi_i,$    (3.37)

where $\{\xi_i\}_{i=1}^{\infty}$ is an orthonormal basis for $\mathcal{H}$. In terms of the Riesz bases
(see e.g., [52], p. 309)

    $\phi_i = \Phi^* \xi_i, \quad \psi_i = \Phi^{-1}\xi_i, \quad i = 1,2,\ldots,$

(3.37) is equivalent to

    $\hat{Q}\hat{P} = \sum_{i=1}^{n_c} \lambda_i\, \phi_i \otimes \psi_i,$    (3.38)

which can be regarded as a specialized spectral decomposition of a semisimple
operator. We emphasize that, in contrast to the singular value decomposition for
compact nonnormal operators (see, e.g., [50], p. 261), the $\lambda_i$ in (3.38) are
eigenvalues of $\hat{Q}\hat{P}$ (see Remark 3.1), not singular values. Moreover, although
$\{\phi_i\}_{i=1}^{\infty}$ and $\{\psi_i\}_{i=1}^{\infty}$ are bases for $\mathcal{H}$, they are not necessarily
orthogonal. They are, however, biorthonormal, i.e., $\langle \phi_i,\psi_j\rangle = \delta_{ij}$, and
hence $\phi_i \otimes \psi_i$ is a rank-one projection and $(\phi_i \otimes \psi_i)(\phi_j \otimes \psi_j) = 0$,
$i \ne j$. Since $\tau$ is a rank $n_c$ projection, it is not surprising that $\tau$ is given
precisely by

    $\tau = \sum_{i=1}^{n_c} \phi_i \otimes \psi_i.$    (3.39)
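
The biorthonormal rank-one structure behind (3.38) and (3.39) can be illustrated with matrices. This is a hedged sketch only: it assumes the real finite-dimensional case and the convention that the rank-one operator built from `phi` and `psi` acts as `x -> <x, phi> psi`, so its matrix is `outer(psi, phi)`. The invertible matrix `Phi`, the eigenvalues `lam`, and the helper `rank_one` are hypothetical choices made for illustration.

```python
import numpy as np


def rank_one(phi, psi):
    """Matrix of the rank-one operator x -> <x, phi> psi (real case, assumed convention)."""
    return np.outer(psi, phi)


rng = np.random.default_rng(1)
n, nc = 5, 2
# A unit upper-triangular Phi is always invertible; its entries are arbitrary.
Phi = np.eye(n) + np.triu(rng.standard_normal((n, n)), k=1)
phis = Phi.T                    # column i is phi_i = Phi^T e_i
psis = np.linalg.inv(Phi)       # column i is psi_i = Phi^{-1} e_i
lam = np.array([2.0, 0.7])      # hypothetical positive eigenvalues

QP = sum(lam[i] * rank_one(phis[:, i], psis[:, i]) for i in range(nc))
tau = sum(rank_one(phis[:, i], psis[:, i]) for i in range(nc))
gram = phis.T @ psis            # entries <phi_i, psi_j>
```

Here `gram` is the identity (biorthonormality), `tau` is an idempotent of rank `nc`, and the nonzero eigenvalues of `QP` are the entries of `lam` rather than its singular values.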


The  following  result  formalizes  the  above  observations. 

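Proposition 3.7. There exist biorthonormal linearly independent sets
$\{\psi_i\}_{i=1}^{n_c} \subset \mathcal{D}(A)$ and $\{\phi_i\}_{i=1}^{n_c} \subset \mathcal{D}(A^*)$ such that (3.38) and (3.39) hold.
Furthermore, if the $(G,M,\Gamma)$-factorization of $\hat{Q}\hat{P}$ is chosen such that $M = \Lambda$, then,
for all $x \in \mathcal{H}$,

    $Gx = (\langle x,\psi_1\rangle, \ldots, \langle x,\psi_{n_c}\rangle)^T,$

    $\Gamma x = (\langle x,\phi_1\rangle, \ldots, \langle x,\phi_{n_c}\rangle)^T,$

and, for all $y = (y_1,\ldots,y_{n_c})^T \in \mathbb{R}^{n_c}$, $G^*$ and $\Gamma^*$ satisfy

    $G^* y = \sum_{i=1}^{n_c} y_i \psi_i, \quad \Gamma^* y = \sum_{i=1}^{n_c} y_i \phi_i.$

Remark 3.3. Note that $\hat{P}\hat{Q}$ and $\tau^*$ are given by

    $\hat{P}\hat{Q} = \sum_{i=1}^{n_c} \lambda_i\, \psi_i \otimes \phi_i,$

    $\tau^* = \sum_{i=1}^{n_c} \psi_i \otimes \phi_i.$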
4. PROOF OF THE MAIN THEOREM


We state and prove a series of lemmas which allow us to compute the
Frechet derivatives of $J$ with respect to $A_c$, $B_c$ and $C_c$. Requiring that
these derivatives vanish leads to the necessary conditions in their "primitive"
form. A transformation of variables then leads to the form of the necessary
conditions (3.9)-(3.18).

Let "u-lim" denote the uniform limit (i.e., limit in operator norm)
for bounded linear operators ([50], p. 150) and, for strongly continuous $S(t) \in \mathcal{B}(\mathcal{H})$,
$t \ge 0$, interpret the strong integral $\int_{t_1}^{t_2} S(t)\,dt$ according to

    $\left(\int_{t_1}^{t_2} S(t)\,dt\right) z = \int_{t_1}^{t_2} S(t)z\,dt, \quad z \in \mathcal{H}$

([50], p. 152). Also recall the standard fact ([6], p. 186)
that $(e^{At})^* = e^{A^* t}$ and similarly for $\tilde{A}$. Throughout this section let
$(A_c,B_c,C_c) \in \mathcal{A}_+$ and let $M, \beta > 0$ satisfy (3.6).

To begin, note that the closed-loop system (3.1)-(3.4) can be written as

    $\dot{\tilde{x}}(t) = \tilde{A}\tilde{x}(t) + \tilde{H}w(t),$    (4.1)


where 


H  S 


BcH2 


€  B2(H"  ®  IR4  )  , 


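For convenience define the nonnegative-definite operator

    $\tilde{V} \triangleq \tilde{H}\tilde{H}^* = \begin{bmatrix} V_1 & 0 \\ 0 & B_c V_2 B_c^T \end{bmatrix} \in \mathcal{B}_1(\tilde{\mathcal{H}}).$

In terms of the augmented state $\tilde{x}(t)$, the performance criterion (3.5) becomes

    $J(A_c,B_c,C_c) = \lim_{t\to\infty} E\langle \tilde{R}\tilde{x}(t), \tilde{x}(t)\rangle,$    (4.2)

where the nonnegative-definite operator $\tilde{R}$ is defined by

    $\tilde{R} \triangleq \begin{bmatrix} R_1 & 0 \\ 0 & C_c^T R_2 C_c \end{bmatrix} \in \mathcal{B}_1(\tilde{\mathcal{H}}).$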


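To write (4.2) in terms of the covariance of $\tilde{x}(t)$, recall ([6], p. 308)
that the covariance $E[(\xi - E\xi)(\xi - E\xi)^*]$ of a Hilbert-space-valued weak
random variable $\xi$ is defined to be the nonnegative-definite operator $S$ which
satisfies

    $\langle Sy,z\rangle = E\langle \xi - E\xi, y\rangle\langle \xi - E\xi, z\rangle$

for all $y,z$ in the Hilbert space. Hence define ([6], p. 317)

    $\tilde{Q}(t) \triangleq E[(\tilde{x}(t) - E\tilde{x}(t))(\tilde{x}(t) - E\tilde{x}(t))^*].$

Lemma 4.1. $\tilde{Q} \triangleq \text{u-lim}_{t\to\infty}\, \tilde{Q}(t)$ exists and is given by

    $\tilde{Q} = \int_0^\infty e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\,dt.$    (4.3)

Furthermore,

    $J(A_c,B_c,C_c) = \mathrm{tr}\, \tilde{Q}\tilde{R}.$    (4.4)

Proof. First compute (as in [6], p. 317)

    $\langle \tilde{Q}(t)\tilde{y},\tilde{z}\rangle = E\langle \tilde{x}(t) - e^{\tilde{A}t}E\tilde{x}(0), \tilde{y}\rangle\langle \tilde{x}(t) - e^{\tilde{A}t}E\tilde{x}(0), \tilde{z}\rangle$

    $= E\langle \int_0^t e^{\tilde{A}(t-s)}\tilde{H}w(s)\,ds, \tilde{y}\rangle\langle \int_0^t e^{\tilde{A}(t-\sigma)}\tilde{H}w(\sigma)\,d\sigma, \tilde{z}\rangle + \langle \tilde{Q}(0)e^{\tilde{A}^*t}\tilde{y}, e^{\tilde{A}^*t}\tilde{z}\rangle$

    $= E\int_0^t\!\!\int_0^t \langle w(s), \tilde{H}^*e^{\tilde{A}^*(t-s)}\tilde{y}\rangle\langle w(\sigma), \tilde{H}^*e^{\tilde{A}^*(t-\sigma)}\tilde{z}\rangle\,ds\,d\sigma + \langle e^{\tilde{A}t}\tilde{Q}(0)e^{\tilde{A}^*t}\tilde{y}, \tilde{z}\rangle$

    $= \int_0^t \langle e^{\tilde{A}(t-s)}\tilde{V}e^{\tilde{A}^*(t-s)}\tilde{y}, \tilde{z}\rangle\,ds + \langle e^{\tilde{A}t}\tilde{Q}(0)e^{\tilde{A}^*t}\tilde{y}, \tilde{z}\rangle,$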
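which shows that $\tilde{Q}(t)$ is given by

    $\tilde{Q}(t) = e^{\tilde{A}t}\tilde{Q}(0)e^{\tilde{A}^*t} + \int_0^t e^{\tilde{A}s}\tilde{V}e^{\tilde{A}^*s}\,ds.$

Clearly, (4.3) makes sense as a strong integral since

    $\int_0^\infty \|e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\|\,dt \le M^2\|\tilde{V}\|\int_0^\infty e^{-2\beta t}\,dt < \infty.$

To demonstrate uniform convergence it need only be noted that

    $\|\tilde{Q} - \tilde{Q}(t)\| = \sup_{\|\tilde{y}\|=1} \|(\tilde{Q} - \tilde{Q}(t))\tilde{y}\|$

    $= \sup_{\|\tilde{y}\|=1} \left\| \int_t^\infty e^{\tilde{A}s}\tilde{V}e^{\tilde{A}^*s}\tilde{y}\,ds - e^{\tilde{A}t}\tilde{Q}(0)e^{\tilde{A}^*t}\tilde{y} \right\|$

    $\le \int_t^\infty \|e^{\tilde{A}s}\tilde{V}e^{\tilde{A}^*s}\|\,ds + \|e^{\tilde{A}t}\tilde{Q}(0)e^{\tilde{A}^*t}\|$

    $\le \tfrac{1}{2}M^2\|\tilde{V}\|\beta^{-1}e^{-2\beta t} + \|\tilde{Q}(0)\|M^2e^{-2\beta t}.$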

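Next, let $\{\phi_i\}_{i=1}^\infty$ be an orthonormal basis for $\tilde{\mathcal{H}}$ and use Parseval's equality to
obtain

    $J(A_c,B_c,C_c) = \lim_{t\to\infty} E\|\tilde{R}^{1/2}\tilde{x}(t)\|^2 = \lim_{t\to\infty} E\sum_{i=1}^\infty \langle \tilde{R}^{1/2}\tilde{x}(t), \phi_i\rangle^2.$

Since $f_n(t) \triangleq \sum_{i=1}^n \langle \tilde{R}^{1/2}\tilde{x}(t), \phi_i\rangle^2$, $t \ge 0$, is nonnegative for each $n$ and is
increasing in $n$ for each $t$ with limit $\langle \tilde{R}\tilde{x}(t), \tilde{x}(t)\rangle$, monotone convergence permits
expectation-limit interchange. Hence using $E\tilde{x}(t) = e^{\tilde{A}t}E\tilde{x}(0)$ we have

    $J(A_c,B_c,C_c) = \lim_{t\to\infty} \sum_{i=1}^\infty E\langle \tilde{x}(t), \tilde{R}^{1/2}\phi_i\rangle^2$

    $= \lim_{t\to\infty} \sum_{i=1}^\infty \left[\langle \tilde{Q}(t)\tilde{R}^{1/2}\phi_i, \tilde{R}^{1/2}\phi_i\rangle + \langle e^{\tilde{A}t}E\tilde{x}(0), \tilde{R}^{1/2}\phi_i\rangle^2\right]$

    $= \lim_{t\to\infty} \left\{\mathrm{tr}[\tilde{R}^{1/2}\tilde{Q}(t)\tilde{R}^{1/2}] + \|\tilde{R}^{1/2}e^{\tilde{A}t}E\tilde{x}(0)\|^2\right\},$

which by Corollary 2.1 yields (4.4). □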


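We shall also require the "dual" of $\tilde{Q}$ given by

    $\tilde{P} \triangleq \int_0^\infty e^{\tilde{A}^*t}\tilde{R}e^{\tilde{A}t}\,dt.$    (4.5)

Since $\tilde{V}$ and $\tilde{R}$ are nonnegative definite it is readily seen that $\tilde{Q}$ and $\tilde{P}$ are also
nonnegative definite.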

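Lemma 4.2. $\tilde{Q}, \tilde{P} \in \mathcal{B}_1(\tilde{\mathcal{H}})$.

Proof. It suffices to consider $\tilde{Q}$ only since the situation for $\tilde{P}$ is
exactly analogous. Since $\tilde{Q}$ is nonnegative definite, Lemma 2.3 can be used.
Letting $\{\phi_i\}_{i=1}^\infty$ be an orthonormal basis for $\tilde{\mathcal{H}}$, we have

    $\mathrm{tr}\,\tilde{Q} = \sum_{i=1}^\infty \langle \tilde{Q}\phi_i, \phi_i\rangle$

    $= \sum_{i=1}^\infty \langle \int_0^\infty e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\phi_i\,dt, \phi_i\rangle$

    $= \lim_{n\to\infty} \int_0^\infty \sum_{i=1}^n \langle \tilde{V}e^{\tilde{A}^*t}\phi_i, e^{\tilde{A}^*t}\phi_i\rangle\,dt.$

Let $f_n(t)$ denote the above integrand. Since $\tilde{V}$ is nonnegative definite,
$\{f_n\}_{n=1}^\infty$ is a monotonically increasing sequence of nonnegative functions such
that $f_n(t) \to \mathrm{tr}\, e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}$, $t \ge 0$. Hence, by monotone convergence and
Lemma 2.2,

    $\mathrm{tr}\,\tilde{Q} = \int_0^\infty \mathrm{tr}[e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}]\,dt$

    $= \int_0^\infty \|e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\|_1\,dt$

    $\le M^2\|\tilde{V}\|_1 \int_0^\infty e^{-2\beta t}\,dt$

    $< \infty.$ □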


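Lemma 4.3. With $\tilde{Q}$ and $\tilde{P}$ given by (4.3) and (4.5) it follows that

    $\mathrm{tr}\,\tilde{Q}\tilde{R} = \mathrm{tr}\,\tilde{V}\tilde{P}.$    (4.6)

Proof. For any orthonormal basis $\{\phi_i\}_{i=1}^\infty$ of $\tilde{\mathcal{H}}$ we have

    $\mathrm{tr}\,\tilde{Q}\tilde{R} = \mathrm{tr}\,\tilde{R}\tilde{Q}$

    $= \sum_{i=1}^\infty \langle \tilde{R}\int_0^\infty e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\phi_i\,dt, \phi_i\rangle$

    $= \lim_{n\to\infty} \int_0^\infty \sum_{i=1}^n \langle \tilde{R}e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\phi_i, \phi_i\rangle\,dt.$

Letting $f_n(t)$ denote the above integrand it follows that $f_n(t) \to \mathrm{tr}[\tilde{R}e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}]$,
$t \ge 0$, and

    $|f_n(t)| \le \sum_{i=1}^\infty |\langle e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\phi_i, \tilde{R}\phi_i\rangle|$

    $\le M^2\|\tilde{V}\|e^{-2\beta t} \sum_{i=1}^\infty \|\tilde{R}\phi_i\|.$

If $\{\phi_i\}_{i=1}^\infty$ is chosen to be the set of orthonormal eigenvectors of $\tilde{R}$ then Lemma
2.1 implies $\sum_{i=1}^\infty \|\tilde{R}\phi_i\| = \|\tilde{R}\|_1$ and thus $|f_n(t)|$ is bounded on $[0,\infty)$ by an
integrable function. Hence by dominated convergence,

    $\mathrm{tr}\,\tilde{Q}\tilde{R} = \int_0^\infty \mathrm{tr}[\tilde{R}e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}]\,dt$

    $= \int_0^\infty \mathrm{tr}[e^{\tilde{A}^*t}\tilde{R}e^{\tilde{A}t}\tilde{V}]\,dt$

    $= \int_0^\infty \sum_{i=1}^\infty \langle \tilde{V}\phi_i, e^{\tilde{A}^*t}\tilde{R}e^{\tilde{A}t}\phi_i\rangle\,dt.$

And again using dominated convergence,

    $\mathrm{tr}\,\tilde{Q}\tilde{R} = \sum_{i=1}^\infty \int_0^\infty \langle \tilde{V}\phi_i, e^{\tilde{A}^*t}\tilde{R}e^{\tilde{A}t}\phi_i\rangle\,dt$

    $= \sum_{i=1}^\infty \langle \tilde{V}\phi_i, \int_0^\infty e^{\tilde{A}^*t}\tilde{R}e^{\tilde{A}t}\phi_i\,dt\rangle$

    $= \mathrm{tr}\,\tilde{V}\tilde{P}.$ □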
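
The trace identity (4.6) is easy to confirm for matrices, where the integrals (4.3) and (4.5) are the solutions of a pair of dual Lyapunov equations. The sketch below is illustrative only: the stable matrix `A`, the factors `H` and `C`, and the small dense solver `lyap` are assumptions made for the example, and the Kronecker-product solve is suitable only for small problems.

```python
import numpy as np


def lyap(A, W):
    """Solve A X + X A^T + W = 0 by Kronecker vectorization (small dense problems only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)


# Hypothetical stable closed-loop matrix (eigenvalues -1 +/- 2j and -3).
A = np.array([[-1.0, 2.0, 0.0],
              [-2.0, -1.0, 1.0],
              [0.0, 0.0, -3.0]])
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
C = np.array([[1.0, 0.0, 2.0]])
V = H @ H.T                      # nonnegative-definite noise intensity
R = C.T @ C                      # nonnegative-definite state weighting

Q = lyap(A, V)                   # A Q + Q A^T + V = 0
P = lyap(A.T, R)                 # A^T P + P A + R = 0
cost_via_Q = np.trace(Q @ R)
cost_via_P = np.trace(V @ P)
```

Both traces give the same steady-state cost, so either the covariance `Q` or its dual `P` can be used to evaluate the criterion.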


The  next  result  is  important  in  that  it  allows  us  to  treat  Q  and  P  as 
solutions  of  dual  algebraic  Lyapunov  equations.  For  a  similar  result  involving 
groups  rather  than  semigroups  see  [50],  pp.  555-557. 


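Lemma 4.4. $\tilde{Q}$ is given by (4.3) if and only if $\tilde{Q} \in \mathcal{B}(\tilde{\mathcal{H}})$ satisfies

    $\tilde{Q}: \mathcal{D}(\tilde{A}^*) \to \mathcal{D}(\tilde{A}),$    (4.7)

    $0 = \tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^* + \tilde{V},$    (4.8)

where (4.8) holds in the sense discussed in Section 3. Furthermore, $\tilde{P}$ is given by
(4.5) if and only if $\tilde{P} \in \mathcal{B}(\tilde{\mathcal{H}})$ satisfies

    $\tilde{P}: \mathcal{D}(\tilde{A}) \to \mathcal{D}(\tilde{A}^*),$    (4.9)

    $0 = \tilde{A}^*\tilde{P} + \tilde{P}\tilde{A} + \tilde{R}.$    (4.10)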
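Proof. We consider $\tilde{Q}$ only. To prove necessity let $t' > 0$. Then for
all $t \in [0,t')$ and $\tilde{x} \in \mathcal{D}(\tilde{A}^*)$ we can write

    $e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t'}\tilde{x} = \int_0^\infty e^{\tilde{A}(t+s)}\tilde{V}e^{\tilde{A}^*(t'+s)}\tilde{x}\,ds$

    $= \int_t^\infty e^{\tilde{A}\sigma}\tilde{V}e^{\tilde{A}^*\sigma}e^{\tilde{A}^*(t'-t)}\tilde{x}\,d\sigma.$

Hence,

    $\frac{d}{dt} e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t'}\tilde{x} = -\int_t^\infty e^{\tilde{A}\sigma}\tilde{V}e^{\tilde{A}^*\sigma}e^{\tilde{A}^*(t'-t)}\tilde{A}^*\tilde{x}\,d\sigma - e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t'}\tilde{x},$    (4.11)

which shows that $e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t'}\tilde{x}$ is strongly differentiable with respect to $t$ for
all $t \in [0,t')$. In particular, setting $t = 0$ it follows that $\tilde{Q}e^{\tilde{A}^*t'}\tilde{x} \in \mathcal{D}(\tilde{A})$ for
all $\tilde{x} \in \mathcal{D}(\tilde{A}^*)$ (see, e.g., [6], p. 173, or [50], p. 485). Performing the
differentiation on the left-hand side of (4.11) and setting $t = 0$ yields

    $\tilde{A}\tilde{Q}e^{\tilde{A}^*t'}\tilde{x} = -\int_0^\infty e^{\tilde{A}\sigma}\tilde{V}e^{\tilde{A}^*\sigma}e^{\tilde{A}^*t'}\tilde{A}^*\tilde{x}\,d\sigma - \tilde{V}e^{\tilde{A}^*t'}\tilde{x}.$    (4.12)

Now fix $\tilde{x} \in \mathcal{D}(\tilde{A}^*)$. Then for $\{t_i\}_{i=1}^\infty$, $t_i > 0$, $t_i \to 0$, we have

    $\tilde{Q}e^{\tilde{A}^*t_i}\tilde{x} \in \mathcal{D}(\tilde{A}), \quad i = 1,2,3,\ldots,$

    $\tilde{Q}e^{\tilde{A}^*t_i}\tilde{x} \to \tilde{Q}\tilde{x}.$

Now consider the sequence $\{\tilde{A}\tilde{Q}e^{\tilde{A}^*t_i}\tilde{x}\}_{i=1}^\infty$. Letting $t' = t_i$ in (4.12) and using
dominated convergence to interchange limit and integration ($\tilde{A}^*\tilde{x}$ is a fixed element
of $\tilde{\mathcal{H}}$), it follows that

    $\lim_{i\to\infty} \tilde{A}\tilde{Q}e^{\tilde{A}^*t_i}\tilde{x} = -\int_0^\infty e^{\tilde{A}\sigma}\tilde{V}e^{\tilde{A}^*\sigma}\tilde{A}^*\tilde{x}\,d\sigma - \tilde{V}\tilde{x}.$    (4.13)

Since $\tilde{A}$ is closed, $\tilde{Q}\tilde{x} \in \mathcal{D}(\tilde{A})$. This proves (4.7). Also, since $\tilde{A}$ is closed we have

    $\lim_{i\to\infty} \tilde{A}\tilde{Q}e^{\tilde{A}^*t_i}\tilde{x} = \tilde{A}\tilde{Q}\tilde{x},$

which with (4.13) implies

    $\tilde{A}\tilde{Q}\tilde{x} = -\tilde{Q}\tilde{A}^*\tilde{x} - \tilde{V}\tilde{x},$

and hence

    $(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^* + \tilde{V})\tilde{x} = 0, \quad \tilde{x} \in \mathcal{D}(\tilde{A}^*),$

as desired.

To prove sufficiency let $\tilde{x} \in \mathcal{D}(\tilde{A}^*)$. Then $e^{\tilde{A}^*t}\tilde{x} \in \mathcal{D}(\tilde{A}^*)$, $t \ge 0$, and

    $\frac{d}{dt} e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t}\tilde{x} = e^{\tilde{A}t}(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)e^{\tilde{A}^*t}\tilde{x},$

hence

    $e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t}\tilde{x} - \tilde{Q}\tilde{x} = \int_0^t e^{\tilde{A}s}(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)e^{\tilde{A}^*s}\tilde{x}\,ds, \quad \tilde{x} \in \mathcal{D}(\tilde{A}^*).$

Thus, extending $\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*$ to all of $\tilde{\mathcal{H}}$ we obtain

    $e^{\tilde{A}t}\tilde{Q}e^{\tilde{A}^*t}\tilde{x} - \tilde{Q}\tilde{x} = -\int_0^t e^{\tilde{A}s}\tilde{V}e^{\tilde{A}^*s}\tilde{x}\,ds, \quad \tilde{x} \in \tilde{\mathcal{H}}.$

Letting $t \to \infty$ yields (4.3). □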
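
In finite dimensions, the equivalence in Lemma 4.4 between the integral (4.3) and the Lyapunov equation (4.8) can be checked by direct quadrature. The following sketch is an illustration under stated assumptions: the diagonalizable stable matrix `A`, the intensity `V`, the truncation horizon `T`, and the step count are all hypothetical, and the matrix exponential is formed from an eigendecomposition rather than a library routine.

```python
import numpy as np

# Hypothetical stable matrix with distinct eigenvalues (-1 +/- 2j and -3).
A = np.array([[-1.0, 2.0, 0.0],
              [-2.0, -1.0, 1.0],
              [0.0, 0.0, -3.0]])
V = np.diag([1.0, 2.0, 0.5])

evals, W = np.linalg.eig(A)
W_inv = np.linalg.inv(W)


def expm_A(t):
    """e^{At} for the diagonalizable matrix A above, via its eigendecomposition."""
    return np.real((W * np.exp(evals * t)) @ W_inv)


# Composite trapezoid rule for the integral of e^{At} V e^{A^T t} over [0, T].
T, steps = 15.0, 3000
h = T / steps
Q_int = np.zeros_like(V)
for k in range(steps + 1):
    E = expm_A(k * h)
    weight = 0.5 if k in (0, steps) else 1.0
    Q_int += weight * h * (E @ V @ E.T)

# Residual of the Lyapunov equation evaluated at the quadrature result.
residual = A @ Q_int + Q_int @ A.T + V
```

The residual is small (limited only by the quadrature step and the truncation at `T`), which is the matrix counterpart of passing from (4.3) to (4.8).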


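We now introduce some notation which will prove to be most convenient
in the following results. For $(A_c',B_c',C_c') \in \mathbb{R}^{n_c\times n_c} \times \mathbb{R}^{n_c\times \ell} \times \mathbb{R}^{m\times n_c}$ define

    $\delta_{A_c} \triangleq A_c' - A_c, \quad \delta_{B_c} \triangleq B_c' - B_c, \quad \delta_{C_c} \triangleq C_c' - C_c$

and

    $\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\| \triangleq \|\delta_{A_c}\| + \|\delta_{B_c}\| + \|\delta_{C_c}\|.$

Furthermore, let $\tilde{A}'$, $\tilde{V}'$ and $\tilde{R}'$ denote $\tilde{A}$, $\tilde{V}$ and $\tilde{R}$ with $(A_c,B_c,C_c)$ replaced
by $(A_c',B_c',C_c')$ and define

    $\delta_{\tilde{A}} \triangleq \tilde{A}' - \tilde{A},$

    $\delta_{\tilde{V}} \triangleq \tilde{V}' - \tilde{V} = \begin{bmatrix} 0 & 0 \\ 0 & B_cV_2\delta_{B_c}^T + \delta_{B_c}V_2B_c^T + \delta_{B_c}V_2\delta_{B_c}^T \end{bmatrix},$

    $\delta_{\tilde{R}} \triangleq \tilde{R}' - \tilde{R} = \begin{bmatrix} 0 & 0 \\ 0 & C_c^TR_2\delta_{C_c} + \delta_{C_c}^TR_2C_c + \delta_{C_c}^TR_2\delta_{C_c} \end{bmatrix}.$

We shall also write $\tilde{Q}'$, $\tilde{P}'$ for $\tilde{Q}$, $\tilde{P}$ as given by (4.3) and (4.5) with $\tilde{A}$, $\tilde{V}$, $\tilde{R}$
replaced by $\tilde{A}'$, $\tilde{V}'$, $\tilde{R}'$ and define

    $\delta_{\tilde{Q}} \triangleq \tilde{Q}' - \tilde{Q}, \quad \delta_{\tilde{P}} \triangleq \tilde{P}' - \tilde{P}.$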

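Lemma 4.5. $\mathcal{A}$ is open.

Proof. Let $(A_c,B_c,C_c) \in \mathcal{A}$ be arbitrary and consider the open set

    $\mathcal{N} \triangleq \{(A_c',B_c',C_c') \in \mathbb{R}^{n_c\times n_c} \times \mathbb{R}^{n_c\times \ell} \times \mathbb{R}^{m\times n_c}: \|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\| < \beta/2M\gamma\},$    (4.14)

where $\gamma \triangleq \max\{1, \|B\|, \|C\|\}$. Then, since $\tilde{A}' = \tilde{A} + \delta_{\tilde{A}}$ and $\delta_{\tilde{A}} \in \mathcal{B}(\tilde{\mathcal{H}})$ it follows
from Theorem 2.1, p. 497 of [50], that for all $(A_c',B_c',C_c') \in \mathcal{N}$ and $t \ge 0$,

    $\|e^{\tilde{A}'t}\| \le Me^{(-\beta + M\|\delta_{\tilde{A}}\|)t} \le Me^{-\beta t/2}.$

Hence, $\mathcal{N} \subset \mathcal{A}$, as desired. □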


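Lemma 4.6. There exists $c > 0$ such that

    $\|\delta_{\tilde{Q}}\| \le c\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|,$    (4.15)

    $\|\delta_{\tilde{P}}\| \le c\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|,$    (4.16)

for all $(A_c',B_c',C_c') \in \mathcal{N}$, where $\mathcal{N} \subset \mathcal{A}$ is the open neighborhood of $(A_c,B_c,C_c)$
defined by (4.14).

Proof. We consider (4.15) only. Since $\|e^{\tilde{A}'t}\| \le Me^{-\beta t/2}$, $t \ge 0$,
$(A_c',B_c',C_c') \in \mathcal{N}$, it follows that

    $\|\delta_{\tilde{Q}}\| \le \int_0^\infty \|e^{\tilde{A}'t}\tilde{V}'e^{\tilde{A}'^*t} - e^{\tilde{A}t}\tilde{V}e^{\tilde{A}^*t}\|\,dt$

    $\le \int_0^\infty \left\{ \|e^{\tilde{A}'t}\|\,\|\tilde{V}'\|\,\|e^{\tilde{A}'^*t} - e^{\tilde{A}^*t}\| + \|e^{\tilde{A}'t}\|\,\|\delta_{\tilde{V}}\|\,\|e^{\tilde{A}^*t}\| + \|e^{\tilde{A}'t} - e^{\tilde{A}t}\|\,\|\tilde{V}\|\,\|e^{\tilde{A}^*t}\| \right\} dt$

    $\le M(\|\tilde{V}\| + \|\delta_{\tilde{V}}\|)\int_0^\infty \|e^{(\tilde{A}+\delta_{\tilde{A}})t} - e^{\tilde{A}t}\|e^{-\beta t/2}\,dt + M^2\|\delta_{\tilde{V}}\|\int_0^\infty e^{-3\beta t/2}\,dt + M\|\tilde{V}\|\int_0^\infty \|e^{(\tilde{A}+\delta_{\tilde{A}})t} - e^{\tilde{A}t}\|e^{-\beta t}\,dt$

    $\le M(2\|\tilde{V}\| + \|\delta_{\tilde{V}}\|)\int_0^\infty \|e^{(\tilde{A}+\delta_{\tilde{A}})t} - e^{\tilde{A}t}\|e^{-\beta t/2}\,dt + \frac{2M^2}{3\beta}\|\delta_{\tilde{V}}\|.$    (4.17)

From [50], p. 497, it follows that the perturbed semigroup $e^{(\tilde{A}+\delta_{\tilde{A}})t}$ has an
expansion

    $e^{(\tilde{A}+\delta_{\tilde{A}})t} = e^{\tilde{A}t} + \sum_{i=1}^\infty U_i(t), \quad t \ge 0,$

where $U_i(t) \in \mathcal{B}(\tilde{\mathcal{H}})$, $t \ge 0$, satisfy the estimates

    $\|U_i(t)\| \le M^{i+1}\|\delta_{\tilde{A}}\|^i e^{-\beta t}t^i/i!.$

Hence, for all $(A_c',B_c',C_c') \in \mathcal{N}$,

    $\|e^{(\tilde{A}+\delta_{\tilde{A}})t} - e^{\tilde{A}t}\| \le \sum_{i=1}^\infty \|U_i(t)\| \le Me^{-\beta t}[e^{M\|\delta_{\tilde{A}}\|t} - 1].$    (4.18)

From (4.17), (4.18) and the relations $\|\delta_{\tilde{A}}\| \le \gamma\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\| < \beta/2M$ and

    $\int_0^\infty e^{-3\beta t/2}[e^{M\|\delta_{\tilde{A}}\|t} - 1]\,dt \le \frac{2M\gamma}{3\beta^2}\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|$

it follows that

    $\|\delta_{\tilde{Q}}\| \le \frac{2M^3\gamma}{3\beta^2}(2\|\tilde{V}\| + \|\delta_{\tilde{V}}\|)\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\| + \frac{2M^2}{3\beta}(2\|B_cV_2\|\,\|\delta_{B_c}\| + \|V_2\|\,\|\delta_{B_c}\|^2),$

which yields (4.15). □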


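Since $\tilde{Q}, \tilde{P} \in \mathcal{B}_1(\tilde{\mathcal{H}})$ we can write

    $\tilde{Q} = \begin{bmatrix} Q_1 & Q_{12} \\ Q_{12}^* & Q_2 \end{bmatrix}, \quad \tilde{P} = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^* & P_2 \end{bmatrix},$

where $Q_1 \in \mathcal{B}_1(\mathcal{H})$, $Q_{12} \in \mathcal{B}(\mathbb{R}^{n_c},\mathcal{H})$, $Q_2 \in \mathbb{R}^{n_c\times n_c}$ and similarly for $P_1$, $P_{12}$ and
$P_2$. Note that $Q_1$, $Q_2$, $P_1$ and $P_2$ are nonnegative definite. Also, define the
notation

    $Z \triangleq \tilde{P}\tilde{Q} = \begin{bmatrix} Z_1 & Z_{12} \\ Z_{21} & Z_2 \end{bmatrix}$

and, for $(A_c',B_c',C_c') \in \mathcal{A}$, let

    $\delta J(\delta_{A_c},\delta_{B_c},\delta_{C_c}) \triangleq J(A_c',B_c',C_c') - J(A_c,B_c,C_c).$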

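Lemma 4.7. Let $(A_c',B_c',C_c') \in \mathcal{A}$. Then

    $\delta J(\delta_{A_c},\delta_{B_c},\delta_{C_c}) = L(\delta_{A_c},\delta_{B_c},\delta_{C_c}) + o(\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|),$    (4.19)

where

    $L(\delta_{A_c},\delta_{B_c},\delta_{C_c}) \triangleq 2\mathrm{tr}[Z_2^T\delta_{A_c}] + 2\mathrm{tr}[(V_2B_c^TP_2 + CZ_{21}^*)\delta_{B_c}] + 2\mathrm{tr}[(Q_2C_c^TR_2 + Z_{12}^*B)\delta_{C_c}]$    (4.20)

and

    $\lim_{\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|\to 0} \|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|^{-1}o(\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|) = 0.$    (4.21)

Proof. Combining (4.8) and (4.10) with (4.6), $J$ can be written as

    $J(A_c,B_c,C_c) = \mathrm{tr}[\tilde{Q}\tilde{R} + \tilde{P}\tilde{V}] + \tfrac{1}{2}\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}^*\tilde{P} + \tilde{P}\tilde{A}) + \tilde{P}\,c\ell(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)],$

and likewise for $(A_c',B_c',C_c')$, where "$c\ell$" denotes closure (i.e., extension) of a
bounded operator to all of $\tilde{\mathcal{H}}$. Now using the identity

    $\mathrm{tr}[\tilde{Q}'\tilde{R}' + \tilde{P}'\tilde{V}'] - \mathrm{tr}[\tilde{Q}\tilde{R} + \tilde{P}\tilde{V}] = \mathrm{tr}[\tilde{Q}\delta_{\tilde{R}} + \tilde{P}\delta_{\tilde{V}}] + \mathrm{tr}[\delta_{\tilde{Q}}\tilde{R}' + \delta_{\tilde{P}}\tilde{V}']$

we can compute

    $\delta J(\delta_{A_c},\delta_{B_c},\delta_{C_c}) = \mathrm{tr}[\tilde{Q}\delta_{\tilde{R}} + \tilde{P}\delta_{\tilde{V}}]$
    $+ \tfrac{1}{2}\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}'^*(\tilde{P}+\delta_{\tilde{P}}) + (\tilde{P}+\delta_{\tilde{P}})\tilde{A}')] + \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}'^*\tilde{P}' + \tilde{P}'\tilde{A}')]$
    $+ \tfrac{1}{2}\mathrm{tr}[\tilde{P}\,c\ell(\tilde{A}'(\tilde{Q}+\delta_{\tilde{Q}}) + (\tilde{Q}+\delta_{\tilde{Q}})\tilde{A}'^*)] + \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{P}}\,c\ell(\tilde{A}'\tilde{Q}' + \tilde{Q}'\tilde{A}'^*)]$
    $- \tfrac{1}{2}\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}^*\tilde{P} + \tilde{P}\tilde{A}) + \tilde{P}\,c\ell(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)] + \mathrm{tr}[\delta_{\tilde{Q}}\tilde{R}' + \delta_{\tilde{P}}\tilde{V}'].$

Using $\tilde{A}' = \tilde{A} + \delta_{\tilde{A}}$ and combining the second, fourth and sixth terms yields

    $\delta J(\delta_{A_c},\delta_{B_c},\delta_{C_c}) = \Lambda + \Omega,$

where

    $\Lambda \triangleq \mathrm{tr}[\tilde{Q}\delta_{\tilde{R}} + \tilde{P}\delta_{\tilde{V}}] + \tfrac{1}{2}\mathrm{tr}[\tilde{Q}(\delta_{\tilde{A}}^*\tilde{P} + \tilde{P}\delta_{\tilde{A}}) + \tilde{P}(\delta_{\tilde{A}}\tilde{Q} + \tilde{Q}\delta_{\tilde{A}}^*)]$
    $= \mathrm{tr}[\tilde{Q}\delta_{\tilde{R}} + \tilde{P}\delta_{\tilde{V}}] + 2\mathrm{tr}[\delta_{\tilde{A}}\tilde{Q}\tilde{P}],$

    $\Omega \triangleq \tfrac{1}{2}\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}'^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A}') + \tilde{P}\,c\ell(\tilde{A}'\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\tilde{A}'^*)]$
    $+ \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}'^*\tilde{P}' + \tilde{P}'\tilde{A}') + \delta_{\tilde{P}}\,c\ell(\tilde{A}'\tilde{Q}' + \tilde{Q}'\tilde{A}'^*)]$
    $+ \mathrm{tr}[\delta_{\tilde{Q}}\tilde{R}' + \delta_{\tilde{P}}\tilde{V}'].$

Computing

    $\mathrm{tr}[\tilde{Q}\delta_{\tilde{R}} + \tilde{P}\delta_{\tilde{V}}] = 2\mathrm{tr}[V_2B_c^TP_2\delta_{B_c}] + 2\mathrm{tr}[Q_2C_c^TR_2\delta_{C_c}] + \mathrm{tr}[P_2\delta_{B_c}V_2\delta_{B_c}^T + Q_2\delta_{C_c}^TR_2\delta_{C_c}],$

    $2\mathrm{tr}[\delta_{\tilde{A}}\tilde{Q}\tilde{P}] = 2\mathrm{tr}[Z_2^T\delta_{A_c}] + 2\mathrm{tr}[CZ_{21}^*\delta_{B_c}] + 2\mathrm{tr}[Z_{12}^*B\delta_{C_c}]$

and retaining first-order terms, we obtain (4.20).

To evaluate $\Omega$, use (4.8) and (4.10) to replace $\tilde{R}'$ and $\tilde{V}'$ in the last term
in $\Omega$ and write $\tilde{A}' = \tilde{A} + \delta_{\tilde{A}}$ to obtain

    $\Omega = \tfrac{1}{2}\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A}) + \tilde{P}\,c\ell(\tilde{A}\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\tilde{A}^*)]$
    $+ \tfrac{1}{2}\mathrm{tr}[\tilde{Q}(\delta_{\tilde{A}}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\delta_{\tilde{A}}) + \tilde{P}(\delta_{\tilde{A}}\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\delta_{\tilde{A}}^*)]$    (4.22)
    $- \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}'^*\tilde{P}' + \tilde{P}'\tilde{A}') + \delta_{\tilde{P}}\,c\ell(\tilde{A}'\tilde{Q}' + \tilde{Q}'\tilde{A}'^*)].$

Next we note that

    $\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A})] = \mathrm{tr}[\delta_{\tilde{P}}\,c\ell(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)].$    (4.23)

To see this we observe that by arguments similar to those used in the proof of Lemma
4.4 and the fact that $\delta_{\tilde{P}}: \mathcal{D}(\tilde{A}) \to \mathcal{D}(\tilde{A}^*)$ it follows that

    $\delta_{\tilde{P}} = -\int_0^\infty e^{\tilde{A}^*t}\,c\ell(\tilde{A}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A})\,e^{\tilde{A}t}\,dt.$

Now, using the technique of Lemma 4.3 with the role of $\tilde{R}$ played by
$-c\ell(\tilde{A}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A})$, we see that

    $\mathrm{tr}[\tilde{Q}\,c\ell(\tilde{A}^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A})] = -\mathrm{tr}[\delta_{\tilde{P}}\tilde{V}] = \mathrm{tr}[\delta_{\tilde{P}}\,c\ell(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)].$

Similarly, it can be shown that

    $\mathrm{tr}[\tilde{P}\,c\ell(\tilde{A}\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\tilde{A}^*)] = \mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}^*\tilde{P} + \tilde{P}\tilde{A})].$    (4.24)

Now substitute (4.23) and (4.24) into (4.22) and rearrange the second term in
(4.22) so that

    $\Omega = \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}^*\tilde{P} + \tilde{P}\tilde{A}) + \delta_{\tilde{P}}\,c\ell(\tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^*)]$
    $+ \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}(\delta_{\tilde{A}}^*\tilde{P} + \tilde{P}\delta_{\tilde{A}}) + \delta_{\tilde{P}}(\delta_{\tilde{A}}\tilde{Q} + \tilde{Q}\delta_{\tilde{A}}^*)]$
    $- \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}'^*\tilde{P}' + \tilde{P}'\tilde{A}') + \delta_{\tilde{P}}\,c\ell(\tilde{A}'\tilde{Q}' + \tilde{Q}'\tilde{A}'^*)]$
    $= -\tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}\,c\ell(\tilde{A}'^*\delta_{\tilde{P}} + \delta_{\tilde{P}}\tilde{A}') + \delta_{\tilde{P}}\,c\ell(\tilde{A}'\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\tilde{A}'^*)].$

Using (4.8) to obtain

    $0 = \tilde{A}'\delta_{\tilde{Q}} + \delta_{\tilde{Q}}\tilde{A}'^* + \delta_{\tilde{A}}\tilde{Q} + \tilde{Q}\delta_{\tilde{A}}^* + \delta_{\tilde{V}}$

and (4.10) to obtain a similar relation involving $\tilde{P}$, we have

    $\Omega = \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{Q}}(\delta_{\tilde{A}}^*\tilde{P} + \tilde{P}\delta_{\tilde{A}} + \delta_{\tilde{R}})] + \tfrac{1}{2}\mathrm{tr}[\delta_{\tilde{P}}(\delta_{\tilde{A}}\tilde{Q} + \tilde{Q}\delta_{\tilde{A}}^* + \delta_{\tilde{V}})].$

Restricting $(A_c',B_c',C_c')$ to $\mathcal{N}$ (see (4.14)), using Lemma 4.6 and noting that $\delta_{\tilde{A}}$, $\delta_{\tilde{V}}$ and $\delta_{\tilde{R}}$
have finite rank, it follows that there exists $c_1 > 0$ such that

    $|\Omega| \le c_1\|(\delta_{A_c},\delta_{B_c},\delta_{C_c})\|^2.$    (4.25)

Combining $\Omega$ with the second-order terms in $\Lambda$ yields the desired result. □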
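
For a matrix plant, the first-order term of Lemma 4.7 can be compared against central finite differences of the cost. This is a hedged sketch under several assumptions: the second-order plant, the first-order compensator, the weights, and the helper names are hypothetical, and the gradient formulas are the finite-dimensional reading of (4.20) with $Z = \tilde{P}\tilde{Q}$, namely $2Z_2$, $2(P_2B_cV_2 + Z_{21}C^T)$ and $2(R_2C_cQ_2 + B^TZ_{12})$.

```python
import numpy as np


def lyap(A, W):
    """Solve A X + X A^T + W = 0 by Kronecker vectorization (small dense problems only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)


# Hypothetical lightly damped plant and weights (assumed values).
A = np.array([[0.0, 1.0], [-1.0, -0.2]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
V1, V2 = np.eye(2), np.array([[0.1]])
R1, R2 = np.eye(2), np.array([[0.5]])
n = A.shape[0]


def closed_loop(Ac, Bc, Cc):
    """Augmented closed-loop matrices for the compensator (Ac, Bc, Cc)."""
    nc = Ac.shape[0]
    At = np.block([[A, B @ Cc], [Bc @ C, Ac]])
    Vt = np.block([[V1, np.zeros((n, nc))], [np.zeros((nc, n)), Bc @ V2 @ Bc.T]])
    Rt = np.block([[R1, np.zeros((n, nc))], [np.zeros((nc, n)), Cc.T @ R2 @ Cc]])
    return At, Vt, Rt


def cost(Ac, Bc, Cc):
    At, Vt, Rt = closed_loop(Ac, Bc, Cc)
    return np.trace(lyap(At, Vt) @ Rt)


def gradients(Ac, Bc, Cc):
    """Gradients of the cost read off from the first-order term, with Z = P~ Q~."""
    At, Vt, Rt = closed_loop(Ac, Bc, Cc)
    Qt, Pt = lyap(At, Vt), lyap(At.T, Rt)
    Z = Pt @ Qt
    gA = 2 * Z[n:, n:]
    gB = 2 * (Pt[n:, n:] @ Bc @ V2 + Z[n:, :n] @ C.T)
    gC = 2 * (R2 @ Cc @ Qt[n:, n:] + B.T @ Z[:n, n:])
    return gA, gB, gC


def finite_difference(f, X, eps=1e-6):
    """Central-difference gradient of the scalar function f at the matrix X."""
    g = np.zeros_like(X)
    for idx in np.ndindex(X.shape):
        D = np.zeros_like(X)
        D[idx] = eps
        g[idx] = (f(X + D) - f(X - D)) / (2 * eps)
    return g


Ac, Bc, Cc = np.array([[-1.0]]), np.array([[0.5]]), np.array([[-0.3]])
gA, gB, gC = gradients(Ac, Bc, Cc)
```

Setting these three gradients to zero is the matrix counterpart of the stationarity conditions obtained from the first-order term.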


Lemma 4.8. $\mathcal{A}_+$ is open.

Proof. From the "generic" property of controllability and
observability ([62], p. 44) there exists an open neighborhood of $(A_c,B_c,C_c)$
each of whose elements is minimal. Combining this fact with Lemma 4.5 yields the
desired result. □

Lemma 4.9. $Q_2$ and $P_2$ are positive definite.

Proof. First note that expanding the $\mathbb{R}^{n_c\times n_c}$-component of the Lyapunov
equation (4.8) yields (4.50) below. By a minor extension of results from [66] or
[67], (4.50) can be rewritten as

    $0 = (A_c + B_cCQ_{12}Q_2^+)Q_2 + Q_2(A_c + B_cCQ_{12}Q_2^+)^T + B_cV_2B_c^T,$

where $Q_2^+$ is the Moore-Penrose or Drazin generalized inverse of $Q_2$.
Next note that since $(A_c,B_c)$ is controllable then so is $(A_c + B_cCQ_{12}Q_2^+, B_cV_2^{1/2})$. Now,
since $Q_2$ and $B_cV_2B_c^T$ are nonnegative definite, it follows from Lemma 12.2 of [62]
that $Q_2$ is positive definite. Similar arguments show that $P_2$ is positive definite. □


Having established Lemmas 4.1-4.9, we can now proceed with the proof of
the Main Theorem. Let $(A_c,B_c,C_c) \in \mathcal{A}_+$ be as in the Main Theorem and consider
(4.19) with $(A_c',B_c',C_c')$ confined to $\mathcal{A}_+$. Because $L: \mathbb{R}^{n_c\times n_c} \times \mathbb{R}^{n_c\times \ell} \times \mathbb{R}^{m\times n_c} \to \mathbb{R}$ is a
bounded linear functional and $\mathcal{A}_+$ is open, the convergence in (4.21) implies
that $L$ is precisely the Frechet derivative of $J$ with respect to $(A_c,B_c,C_c)$.
Since $\mathcal{A}_+$ is open, the optimality of $(A_c,B_c,C_c)$ implies

    $L(\delta_{A_c},\delta_{B_c},\delta_{C_c}) = 0$    (4.26)

for all $(\delta_{A_c},\delta_{B_c},\delta_{C_c})$. Clearly, (4.26) is equivalent to

    $Z_2 = 0,$    (4.27)

    $V_2B_c^TP_2 + CZ_{21}^* = 0,$    (4.28)

    $Q_2C_c^TR_2 + Z_{12}^*B = 0.$    (4.29)

Thus, $B_c$ and $C_c$ are given by

    $B_c = -P_2^{-1}Z_{21}C^*V_2^{-1},$    (4.30)

    $C_c = -R_2^{-1}B^*Z_{12}Q_2^{-1}.$    (4.31)

Although $B_c$ and $C_c$ are now determined in terms of $\tilde{Q}$ and $\tilde{P}$, $A_c$
remains to be found. Moreover, $\tilde{Q}$ and $\tilde{P}$ themselves depend (via (4.8) and (4.10))
on $B_c$ and $C_c$. Hence our task now is to consolidate and simplify (4.7)-(4.10),
(4.27), (4.30) and (4.31) to obtain the more tractable conditions (3.9)-(3.18).
To this end let us define new variables
Clearly, $\hat{Q}$ and $\hat{P}$ are nonnegative definite and have finite rank. Since by Lemma
4.2 $\tilde{Q},\tilde{P} \in \mathcal{B}_1(\tilde{\mathcal{H}})$, it can be seen that $Q_1,P_1 \in \mathcal{B}_1(\mathcal{H})$, which implies $Q,P \in
\mathcal{B}_1(\mathcal{H})$. To show that $Q$ and $P$ are nonnegative definite, note that $Q$ is the
$\mathcal{B}(\mathcal{H})$-component of the nonnegative-definite operator $\bar{Q}\tilde{Q}\bar{Q}^* \in \mathcal{B}(\tilde{\mathcal{H}})$, where

    $\bar{Q} \triangleq \begin{bmatrix} I_{\mathcal{H}} & -Q_{12}Q_2^{-1} \\ 0 & I_{n_c} \end{bmatrix}.$

Similarly, $P$ is nonnegative definite.


From the domain conditions (4.7) and (4.9) it follows that

    $Q_1: \mathcal{D}(A^*) \to \mathcal{D}(A), \quad P_1: \mathcal{D}(A) \to \mathcal{D}(A^*),$    (4.34a,b)

    $Q_{12}: \mathbb{R}^{n_c} \to \mathcal{D}(A), \quad P_{12}: \mathbb{R}^{n_c} \to \mathcal{D}(A^*),$    (4.35a,b)

which lead to (3.12) and (3.13).

Next note that (4.27) is equivalent to (3.8) with

    $G \triangleq Q_2^{-1}Q_{12}^*, \quad \Gamma \triangleq -P_2^{-1}P_{12}^*,$    (4.36a,b)

and that (3.7) holds with

    $M \triangleq Q_2P_2.$    (4.37)

Since $Q_2$ and $P_2$ are positive definite, Lemma 2.6 implies that $M$ is positive
semisimple. We can also define $\tau \triangleq G^*\Gamma$ which, by (3.8), satisfies $\tau^2 = \tau$. It

is  helpful  to  note  the  identities 


(4.38a,b) 


fi  -  012g  -  c*0*2,  i  =  -P12r  -  -r*p*2, 
0  =  g*q2g,  £  =  r*p2r, 


tg  =  g  ,  Tt  =  r , 


®  =  -Q  p‘  . 
W12  12 


(4.39a,b) 

(4.40a ,b) 
(4.41a,b) 

(4.42) 


From (3.8) and (2.1) it follows that

    $\rho(G) = \rho(\Gamma) = n_c,$    (4.43a,b)

    $\rho(Q_{12}) = \rho(P_{12}) = n_c.$    (4.44a,b)

Hence, (2.2) and (4.38) imply $n_c = \rho(Q_{12}) + \rho(G) - n_c \le \rho(\hat{Q}) \le \rho(Q_{12}) = n_c$,
which yields (3.14a). Similarly, (3.14b) holds and (3.14c) follows from (2.2) and
(4.42).

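Using (4.38) and (4.39), the components of $\tilde{Q}$ and $\tilde{P}$ can be written in
terms of $G$, $\Gamma$, $Q$, $P$, $\hat{Q}$ and $\hat{P}$ as

    $Q_1 = Q + \hat{Q}, \quad P_1 = P + \hat{P},$    (4.45)

    $Q_{12} = \hat{Q}\Gamma^*, \quad P_{12} = -\hat{P}G^*,$    (4.46)

    $Q_2 = \Gamma\hat{Q}\Gamma^*, \quad P_2 = G\hat{P}G^*.$    (4.47)

Now (3.10) and (3.11) can be obtained by substituting (4.45)-(4.47) into (4.30)
and (4.31).

Expanding the $\mathcal{B}(\mathcal{H})$, $\mathcal{B}(\mathbb{R}^{n_c},\mathcal{H})$ and $\mathbb{R}^{n_c\times n_c}$ components of (4.8) and (4.10)
yields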
    $0 = AQ_1 + Q_1A^* + BC_cQ_{12}^* + Q_{12}(BC_c)^* + V_1,$    (4.48)

    $0 = AQ_{12} + Q_{12}A_c^T + BC_cQ_2 + Q_1(B_cC)^*,$    (4.49)

    $0 = A_cQ_2 + Q_2A_c^T + B_cCQ_{12} + Q_{12}^*(B_cC)^* + B_cV_2B_c^T,$    (4.50)

    $0 = A^*P_1 + P_1A + (B_cC)^*P_{12}^* + P_{12}B_cC + R_1,$    (4.51)

    $0 = A^*P_{12} + P_{12}A_c + (B_cC)^*P_2 + P_1BC_c,$    (4.52)

    $0 = A_c^TP_2 + P_2A_c + (BC_c)^*P_{12} + P_{12}^*BC_c + C_c^TR_2C_c.$    (4.53)

Substituting (4.45)-(4.47) into (4.48)-(4.53), using the identities

    $B_cC = \Gamma Q\bar{\Sigma}, \quad BC_c = -\Sigma PG^*,$

    $B_cV_2B_c^T = \Gamma Q\bar{\Sigma}Q\Gamma^*, \quad C_c^TR_2C_c = GP\Sigma PG^*,$

and defining

    $A_Q \triangleq A - Q\bar{\Sigma}, \quad A_P \triangleq A - \Sigma P,$

we obtain

    $0 = AQ + QA^* + A_P\hat{Q} + \hat{Q}A_P^* + V_1,$    (4.54)

    $0 = [A_P\hat{Q} + Q\bar{\Sigma}Q + \hat{Q}(\Gamma^*A_c^TG + \bar{\Sigma}Q)]\Gamma^*,$    (4.55)

    $0 = \Gamma[G^*A_c\Gamma\hat{Q} + Q\bar{\Sigma}\hat{Q} + Q\bar{\Sigma}Q + \hat{Q}(\Gamma^*A_c^TG + \bar{\Sigma}Q)]\Gamma^*,$    (4.56)

    $0 = A^*P + PA + A_Q^*\hat{P} + \hat{P}A_Q + R_1,$    (4.57)

    $0 = -[A_Q^*\hat{P} + P\Sigma P + \hat{P}(G^*A_c\Gamma + \Sigma P)]G^*,$    (4.58)

    $0 = G[\Gamma^*A_c^TG\hat{P} + P\Sigma\hat{P} + P\Sigma P + \hat{P}(G^*A_c\Gamma + \Sigma P)]G^*.$    (4.59)


We are now in a position to determine $A_c$ by computing (4.56) - $\Gamma$(4.55)
which yields (3.9). Alternatively, $A_c$ can be obtained by computing (4.59) +
$G$(4.58). As mentioned in Section 3, (3.9) is valid since $G^*: \mathbb{R}^{n_c} \to \mathcal{D}(A)$ and
$A_c^T$ is given by (3.26).

Next we substitute the expressions for $A_c$ and $A_c^T$ into (4.55), (4.56),
(4.58) and (4.59) and compute the relations (4.55)$G$, $G^*$(4.56)$G$, -(4.58)$\Gamma$ and
$\Gamma^*$(4.59)$\Gamma$ to obtain, respectively,

    $0 = [A_P\hat{Q} + \hat{Q}A_P^* + Q\bar{\Sigma}Q]\tau^*,$    (4.60)

    $0 = \tau[A_P\hat{Q} + \hat{Q}A_P^* + Q\bar{\Sigma}Q]\tau^*,$    (4.61)

    $0 = [A_Q^*\hat{P} + \hat{P}A_Q + P\Sigma P]\tau,$    (4.62)

    $0 = \tau^*[A_Q^*\hat{P} + \hat{P}A_Q + P\Sigma P]\tau.$    (4.63)

Note that (4.60)-(4.63) are equivalent to (4.55), (4.56), (4.58) and (4.59) since
$G$ and $\Gamma$ have full rank. Since (4.61) = $\tau$(4.60) and (4.63) = $\tau^*$(4.62), (4.61)
and (4.63) are superfluous and can be omitted. Thus we have derived (3.17) and
(3.18).

To obtain (3.15) and (3.16) we need only compute the relations (4.54) +
$\tau$(4.60) - (4.60) - (4.60)$^*$ and (4.57) + $\tau^*$(4.62) - (4.62) - (4.62)$^*$ and use
(4.41).

Finally, to show that the preceding development entails no loss of
generality in the optimality conditions we now use (3.9)-(3.18) to obtain
(4.7)-(4.10) and (4.27)-(4.29). Let $A_c, B_c, C_c, G, \Gamma, \tau, Q, P, \hat{Q}, \hat{P}$ be as in the
theorem statement and define $Q_1, Q_{12}, Q_2, P_1, P_{12}, P_2$ by (4.45)-(4.47).
Note that (3.12) and (3.13) imply (4.34) and (4.35) and hence (4.7) and (4.9).
Using (3.8), (3.10), (3.11) and (3.22) it is easy to verify (4.27)-(4.29).
Finally, substitute (4.32), (4.33) and (4.36) into (3.15)-(3.18), reverse the steps
taken earlier in the proof and use (3.9)-(3.11) to obtain (4.8) and (4.10), which
completes the proof. □


5.  CONCLUDING  REMARKS 


This  paper  has  considered  the  problem  of  quadratically  optimal, 
steady-state,  fixed-order  dynamic  compensation  for  linear  infinite-dimensional 
systems.  The  Main  Theorem  presents  the  stationarity  conditions  of  the 
optimization problem in a highly simplified and rigorous form. The "optimal
projection equations" (3.15)-(3.18) (or, equivalently, (3.27)-(3.30)) of the Main
Theorem reveal the essential structure of the first-order necessary conditions and
display the central role played by the optimal projection $\tau$. The relationship of
the Main Theorem to the standard finite-dimensional steady-state LQG problem can
be demonstrated by replacing $\tau$ with the identity matrix and noting that (3.27) and
(3.28)  reduce  immediately  to  the  familiar  pair  of  operator  Riccati  equations  and 
that  (3.29)  and  (3.30)  yield  the  usual  controllability  and  observability  gramians. 

Inasmuch  as  the  Main  Theorem  is  a  fundamental  generalization  of 
classical  steady-state  LQG  theory,  a  number  of  issues  must  be  reexamined.  Hence, 
in  conclusion  we  should  like  to  point  out  some  possible  extensions  of  the  Main 
Theorem  along  with  directions  for  further  research. 

1. Sufficiency theory. Although sufficient conditions for the
existence of an optimal compensator were not investigated in this
paper, auxiliary conditions based upon the structure of
(3.15)-(3.18) could perhaps be imposed upon $Q$, $P$, $\hat{Q}$ and $\hat{P}$ to single
out the global optimum from amongst the local minima. This would
be similar to the situation in LQG theory where, under
stabilizability and detectability hypotheses, optimal stabilizing $Q$
and $P$ are identified as the unique positive semidefinite solutions
of the pair of algebraic Riccati equations. Second and
higher-order optimality conditions appear promising in this regard
and are currently being investigated.

2. Stabilizability. Just as in the full-order LQG problem, one would
expect a natural relationship between the structure of the optimal
solution and stabilizability/detectability hypotheses. The results
of [41], [42] and [68] could serve as a starting point in this
regard.

3. Numerical algorithms. In practical situations, the distributed
parameter system would be replaced by a high-order discretized model
for which the matrix version (rather than the operator version) of
the optimal projection equations could be solved numerically. A
numerical algorithm for solving the matrix version of the optimal
projection equations has been developed in [32] and [34]. The
proposed computational scheme is fundamentally quite different from
gradient search algorithms ([17,18,21,22,24,25,28,30]) in that it
operates through direct solution of the optimal projection equations
by iterative refinement of the optimal projection.

4. Convergence. One of the principal uses for the optimal projection
equations will be to understand the relationship between fixed-order
dynamic-compensator designs which are optimal with respect to
approximate models and the optimal fixed-order dynamic compensator
for the distributed parameter system itself. By considering a
sequence of nth-order approximate models which converge to the
distributed parameter system, conditions would be sought
guaranteeing that the sequence of fixed-order compensators based on
each approximate model approaches the optimal dynamic compensator
based upon the distributed parameter system (see [38-40]). This
approach is analogous to the convergence results obtained in [7,8]
with the major difference being that the optimal projection
equations permit the order of the compensator to remain fixed in
accordance with real-world implementation constraints whereas in
[7-9] the order of the compensator increases without bound.

5. Unbounded control and observation. An important generalization of
the problem considered in this paper involves the case in which the
input and output operators $B$ and $C$ are unbounded. The mathematical
details for this problem are considerably more complex (see, e.g.,
[69]).

6. Singular observation noise/singular control weighting. As pointed
out in [22,33,36] the assumptions of nonsingular control weighting
and nonsingular observation noise preclude the use of direct output
feedback as in

    $u(t) = C_cx_c(t) + D_cy(t)$    (5.1)

since $J$ is undefined unless

    $\mathrm{tr}\, D_c^TR_2D_cV_2 = 0 \quad (\Leftrightarrow R_2D_cV_2 = 0).$

Although with due attention to (5.1) direct output feedback can be
used in the singular case, the nature of the problem forebodes all
of the difficulties associated with the singular LQG problem. Note
that the deterministic output feedback problem ([70]), when viewed
in this context, is highly singular.

7. Discrete-time system/discrete-time compensator. Digital
implementation can be modelled by a discrete-time compensator with
control of a continuous-time system facilitated by sampling and
reconstruction devices. See [71] for results in this direction.

8. Cross weighting/correlated disturbance and observation noise. This
extension is straightforward and entirely analogous to the LQG case
(see, e.g., [18], p. 351).
1. Introduction


This  paper  will  be  broad  in  scope  and  has  the  following  objectives: 

1.  to  discuss  the  underlying  philosophy  and  motivation  of  the  optimal 
projection/maximum entropy (OP/ME) stochastic modelling and
reduced-order  design  methodology  for  high-order  systems  with 
parameter  uncertainties; 

2.  to  present  a  rigorous  mathematical  development  of  the  principal 
design  results  for  reduced-order  modelling,  reduced-order  state 
estimation  and  reduced-order  dynamic  compensation  including  the 
effects  of  parameter  uncertainties;  and 

3.  to  contrast  this  approach  philosophically  and  technically  with 
several  alternative  methods  with  regard  to  capabilities  and 
limitations. 

The  basis  for  this  paper  is  references  [1-25]  along  with  recently-obtained  results. 

The  OP/ME  approach,  as  its  name  suggests,  represents  the  synthesis  of 
two  distinct  ideas:  (1)  reduced-order  system  design,  e.g.,  model,  state  estimator 
or  dynamic  compensator,  for  a  given  high-order  plant  (i.e.,  optimal  projection 
design)  and  (2)  minimum-information  stochastic  modelling  of  parameter 
uncertainties  (i.e.,  maximum  entropy  modelling).  In  view  of  the  theme  of  the
workshop,  the  overwhelming  emphasis  of  this  paper  will  be  on  maximum  entropy
modelling.  However,  since  order  uncertainties  due  to  inadvertent  or  intentional
system  truncation  (either  from  infinite-to-finite  or  finite-to-finite  dimensions)
at  various  stages  in  the  design  process  are  also  pertinent  to  the  workshop,  the 
reduced-order  aspect  of  our  work  will  also  be  included.  Maximum  entropy  modelling 
is  discussed  in  [1-13,15]  and  optimal  projection  design  is  studied  in 
[6,10,12,14,16-25].  Before  discussing  maximum  entropy  modelling,  we  shall  briefly 
review  optimal  projection  design  for  high-order  systems,  assuming  for  the  moment 
an  absence  of  modelling  errors.  Although  mathematically  classical,  optimal 
projection  design  represents  a  novel  approach  to  the  reduced-order  modelling, 
reduced-order  state-estimation  and  reduced-order  dynamic-compensation  problems. 


This  approach  has  recently  been  vigorously  developed  and  it  should  be  noted  that 
[22]  is  scheduled  for  archival  publication  and  [23-25]  are  currently  under  similar 
review. 

The optimal projection approach is based entirely on a series of three
theorems (these can be obtained as corollaries of results given below) which
characterize the quadratically optimal reduced-order model, state estimator and
dynamic compensator. Assuming a purely dynamic linear system structure for the
desired system (model, estimator or compensator) whose order is determined by
implementation constraints (e.g., reliability, complexity or computing
capability), a parameter optimization approach is taken. There is, of course,
nothing novel about this approach per se and it has been widely studied in the
control literature [26-39]. Although the model-reduction problem was also handled
in this way in [40-42], to the authors' knowledge the widely-studied reduced-order
state-estimation problem ([43,44]) was not directly approached in this fashion.
Clearly, the parameter optimization approach fell into disrepute because of the
extreme complexity of the grossly unwieldy first-order necessary conditions which
afforded little insight and engendered brute force gradient search techniques.
The crucial discovery occurred in [6] where it was revealed that the necessary
conditions for the dynamic-compensation problem give rise to the definition of an
optimal projection as a rigorous, unassailable consequence of quadratic optimality
without recourse to ad hoc methods as in [45-54]. Exploitation of this projection
leads to immense simplification of the "primitive" form of the necessary
conditions for each of these problems. As summarized in Figure 1, the modelling,
estimation and compensation design equations form a natural progression:
a coupled system of 2, 3 or 4 matrix equations whose solutions determine the
desired gains $(A_m,B_m,C_m)$, $(A_e,B_e,C_e)$ or $(A_c,B_c,C_c)$. The novel equations
are the modified Lyapunov equations for the reduced-order modelling problem,
versions  of  which  arise  in  the  estimation  and  compensation  problems.  Since  the 
modified  Riccati  equations  are  analogous  to  the  standard  observer  and  regulator 
Riccati  equations,  the  optimal  projection  equations  for  the  reduced-order 
state-estimation  and  dynamic-compensation  problems  appear  to  provide  a  fundamental 
generalization  of  steady-state  Kalman  filter  and  LQG  theory.

Although  optimal  projection  design  deals  directly  and  rigorously  with
the  question  of  system  dimension  by  trading  order  off  against  performance,  it  is, 
nevertheless,  predicated  upon  the  availability  of  a  completely  accurate  plant  and 
disturbance  model.  Maximum  entropy  modelling,  however,  addresses  the  robustness 
problem  by  permitting  direct  inclusion  of  parameter  uncertainties  in  the  plant  and 
disturbance  models  so  that  optimal  projection  design  plus  maximum  entropy 
modelling  automatically  yields  system  designs  (reduced-order  models,  state 
estimators  and  dynamic  compensators)  that  trade  performance  off  against  modelling 
uncertainties.  In  maximum  entropy  modelling,  uncertainties  are  modelled  at  their 
a  priori  levels  and  there  is  no  adaptation  or  learning  provided  for  in  the  design, 
i.e.,  the  control  is  nondual  ([55-57]). 

Before  attempting  an  overview  of  the  maximum  entropy  approach  it  is 
important  to  discuss  the  class  of  problems  that  motivated  this  work,  namely, 
control  of  large  flexible  space  structures.  A  finite-element  model  of  a  large 
flexible  space  structure  is,  generally,  an  extremely  high-order  system.  For 
example,  a  version  of  the  widely-studied  Draper  Model  #2  includes  150  modes  and  6 
disturbance  states,  i.e.,  a  total  of  306  states,  along  with  9  sensors  and  9 
actuators.  The  size  of  the  model  and  the  coupling  between  sensors  and  actuators
render  classical  control-design  methods  useless  and  all  but  confound  attempts  to
use  LQG  to  obtain  a  controller  of  manageable  order.  Indeed,  these  difficulties
were  a  prime  motivation  for  the  optimal  projection  approach.  Besides  the  high 
order  of  these  systems,  finite  element  modelling  is  known  to  have  poor  accuracy, 
particularly  for  the  high-order  modes.  Reasonable  and  not  overly-conservative 
uncertainty  estimates  predict  30-50  percent  error  in  modal  frequencies  after  the 
first  ten  modes,  with  the  situation  considerably  more  complex  (and  pessimistic) 
for  damping  estimates.  Otherwise-successful  control-design  methodologies  widely 
promulgated  in  the  aerospace  community  were  severely  strained  in  the  face  of  such 
difficulties. 

Maximum  entropy  modelling  is  a  form  of  stochastic  modelling.  Although 
external  disturbances  are  traditionally  modelled  stochastically  as  random 
processes,  the  use  of  stochastic  theory  to  model  plant  parameter  uncertainty  has 
seen  relatively  limited  application.  We  seek  to  dispel  all  objections  to  a 
stochastic  parameter  uncertainty  model  by  invoking  the  modern  information- 
theoretic  interpretation  of  probability  theory.  Rather  than  regarding  the 
probability  of  an  event  as  the  limiting  frequency  of  numerous  repetitions  (as, 
e.g.,  the  number  of  heads  in  1,000  coin  tosses)  we  adopt  the  view  that  the
probability  of  an  event  is  a  subjective  quantity  which  reflects  the  observer's
certainty  as  to  whether  a  particular  event  will  or  will  not  occur.  This  quantity 
is  nothing  more  than  a  measure  of  the  information  (including,  e.g.,  all 
theoretical  analysis  and  empirical  data)  available  to  the  observer.  In  this  sense 
the  validity  of  a  stochastic  model  of  a  flexible  space  structure,  for  example, 
does  not  rely  upon  the  existence  of  a  fleet  of  such  objects  (substitute  "ensemble"
for  "fleet"  in  the  classical  terminology)  but  rather  resides  in  the  interpretation 
that  it  expresses  the  engineer's  certainty  or  uncertainty  regarding  the  values  of 
physical  parameters  such  as  stiffnesses  of  structural  components.  This  view  of 
probability  theory  has  its  roots  in  Shannon's  information  theory  but  was  first 
articulated  unambiguously  by  Jaynes  (see  [58-61]).* 

The  preeminent  problem  in  modelling  the  real  world  is  thus  the 
following:  given  limited  (incomplete)  a  priori  data,  how  can  a  well-defined 
(complete)  probability  model  be  constructed  which  is  consistent  with  the  available 
data  but  which  avoids  inventing  data  which  does  not  exist?  To  this  end  we  invoke 
Jaynes'  Maximum  Entropy  Principle:  First,  define  a  measure  of  ignorance  in  terms 
of  the  information-theoretic  entropy,  and  then  determine  the  probability 
distribution  which  maximizes  this  measure  subject  to  agreement  with  the  available 
data.**  The  reasoning  behind  this  principle  is  that  the  probability  distribution 
which  maximizes  a  priori  ignorance  must  be  the  least  presumptive  (i.e.,  least 
likely  to  invent  data)  on  the  average  since  the  amount  of  a  posteriori  learned 
information  (should  all  uncertainty  suddenly  disappear)  would  necessarily  be 
maximized.  If,  for  some  probability  distribution,  the  a  priori  ignorance  and 
hence  the  a  posteriori  learning  were  less  than  their  maximum  value  then  this 
distribution  must  be  based  upon  invented  and  hence,  generally  incorrect,  data. 

The  Maximum  Entropy  Principle  is  clearly  desirable  for  control-system  design  where 
the  introduction  of  false  data  is  to  be  assiduously  avoided. 
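
As a minimal illustration of the principle (a sketch, not taken from the cited references): if the only available datum is that exactly one of four outcomes occurs, the uniform assignment has the largest Shannon entropy, and any skewed assignment encodes information that was never supplied. The candidate distributions and the helper name `shannon_entropy` are assumptions made for this example.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution p."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

# Hypothetical candidate distributions over four outcomes; each sums to one.
candidates = {
    "uniform": [0.25, 0.25, 0.25, 0.25],
    "mildly skewed": [0.40, 0.30, 0.20, 0.10],
    "strongly skewed": [0.85, 0.05, 0.05, 0.05],
}

entropies = {name: shannon_entropy(p) for name, p in candidates.items()}

# The least presumptive candidate is the one with the largest entropy.
least_presumptive = max(entropies, key=entropies.get)

for name, h in entropies.items():
    print(f"{name:16s} entropy = {h:.4f} nats")
print("least presumptive:", least_presumptive)
```

With additional data (for example a prescribed mean) the maximization would be carried out subject to that constraint, and the result would in general no longer be uniform.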

It is shown in [1] that the stochastic model induced by the Maximum
Entropy Principle of Jaynes is a Stratonovich multiplicative white noise model.
Rather than base this paper on a demonstration of this result, whose derivation
is confined to modal (structural) systems, we shall arbitrarily adopt the
Stratonovich multiplicative white noise model and proceed by exploring its
consequences for general systems. The idea of inducing a stochastic model from
limited data remains, however, fundamental to maximum entropy modelling and
figures prominently in the results below.

*The reader is encouraged to peruse [60] (which is also available in [61])
where this interpretation of probability theory is discussed.

**The smallest collection of data for which a well-defined probability model
(called the minimum information model) can be constructed is known as the
minimum data set.

A  review  of  the  mathematical  and  control  systems  literature  on 
multiplicative  white  noise  is  absolutely  essential  for  communicating  the 
contribution  of  maximum  entropy  modelling.  The  theory  of  stochastic  differential 
equations  was  placed  on  a  firm  mathematical  foundation  by  Ito  ([62])  and  has  been 
widely  developed  and  applied  to  modelling,  estimation  and  control  problems 
([63-91]).  The  basic  linear  multiplicative  white  noise  model  is  given  by 


$$\dot{\tilde x}(t) = \Big(\tilde A + \sum_{i=1}^{p} \alpha_i(t)\,\tilde A_i\Big)\tilde x(t) + w(t), \tag{1.1}$$

where $\tilde x(t) \in \mathbb{R}^{\tilde n}$, $\tilde A, \tilde A_i \in \mathbb{R}^{\tilde n \times \tilde n}$, $w(t)$ is zero-mean Gaussian white disturbance
noise with nonnegative-definite intensity $V$, and $\alpha_i(t)$ are zero-mean, unit-intensity
Gaussian white noise processes which are mutually uncorrelated and
uncorrelated with $w(t)$. The multiplicative white noise model (1.1) can be
regarded as a parameter uncertainty model where each $\alpha_i(t)$ corresponds to a
single uncertain parameter whose pattern and magnitude are given by $\tilde A_i/\|\tilde A_i\|$
and $\|\tilde A_i\|$, respectively.

To see why (1.1) is a minimum information model of parameter
uncertainty, note that when the pattern $\tilde A_i/\|\tilde A_i\|$ of an uncertain parameter is
known, all available data (theoretical and empirical) can be brought to bear
("boiled down") to determine its magnitude $\|\tilde A_i\|$. Clearly, the collection of
magnitudes constitutes the minimum data set needed to render (1.1) well defined.
For  the  harmonic  oscillator  with  uncertain  natural  frequency,  the  uncertainty 
magnitude  is  given  by  the  reciprocal  of  the  relaxation  time  (see  Figure  2). 

To  eliminate  the  white  noise  formalism,  the  model  (1.1)  is  usually 
rigorized  by  the  Ito  differential  equation 

$$d\tilde x_t = \Big(\tilde A\,dt + \sum_{i=1}^{p} d\alpha_{it}\,\tilde A_i\Big)\tilde x_t + dw_t, \tag{1.2}$$

where $\alpha_{it}$ and $w_t$ are Brownian motions, i.e., Wiener processes. Although
such models were studied extensively for estimator and control design ([72-88]),
this  approach  fell  into  disrepute  with  the  publication  of  [90,91]  where  it  was 
shown  for  discrete-time  systems  that  sufficiently  high  uncertainty  levels  (i.e., 
magnitudes  $\|\tilde A_i\|$  above  a  threshold)  lead  to  the  nonexistence  of  a  steady  state
solution.  Although  it  was  purported  in  [90]  that  this  "phenomenon"  was  an 
"obvious"  consequence  of  high  uncertainty  levels,  these  conclusions  failed  to  take 
into  account  (possibly  because  of  the  discrete-time  setting)  the  subtle 
relationship  between  the  ordinary  differential  equation  (1.1)  and  the  stochastic 
differential  equation  (1.2).  Indeed,  it  was  shown  in  [63]  that  if  a  stochastic 
differential  equation  is  regarded  as  the  limit  of  a  sequence  of  ordinary 
differential  equations,  then  (1.2)  is  not  the  correct  version  of  (1.1).  Instead, 
the  ordinary  differential  equation  (1.1)  with  multiplicative  white  noise 
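corresponds  to  the  corrected  Ito  differential  equation

$$d\tilde x_t = \Big(\tilde A_s\,dt + \sum_{i=1}^{p} d\alpha_{it}\,\tilde A_i\Big)\tilde x_t + dw_t, \tag{1.3}$$

where

$$\tilde A_s \triangleq \tilde A + \tfrac{1}{2}\sum_{i=1}^{p} \tilde A_i^2, \tag{1.4}$$
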

which  differs  from  the  "naive"  equation  (1.2)  by  a  systematic  drift  term. 

Although  skepticism  regarding  this  unusual  result  was  admitted  to  in  [63],  the 
form  of  (1.3)  was  corroborated  completely  independently  by  Stratonovich  in  [64], 
whose  results  actually  appeared  in  the  Russian  literature  prior  to  1965.  His 
approach  is  based  upon  an  alternative  definition  of  stochastic  integration  which 
differs  from  Ito  stochastic  integration  by  a  mathematical  technicality.  The 
Stratonovich  approach,  it  should  be  noted,  has  the  interesting  feature  that 
approximating  sums  involve  future  values  of  a  Brownian  motion  process  which, 
although  physically  unacceptable  in  the  classical  view  of  probability,  is 
completely  consistent  with  the  information-theoretic  interpretation. 

In  spite  of  the  glaring  technicality  of  the  Stratonovich  correction, 
almost  all  research  on  the  estimation  and  control  of  such  systems  failed  to 
perceive  its  physical  significance.  To  the  authors'  knowledge,  the  work  of 
Gustafson  and  Speyer  [88]  was  the  only  paper  prior  to  the  appearance  of  [1]  which 
demonstrated  the  crucial  feature:  The  Stratonovich  correction  neutralizes  the 
threshold  uncertainty  principle!  We  shall  now  proceed  to  demonstrate  this  fact  by 
means  of  a  compelling  example.

First, suppose that zero-point deviations of $\tilde x(t)$ are of interest and
are evaluated according to

$$J = \lim_{t\to\infty} \mathbb{E}\big[\tilde x(t)^T \tilde R\,\tilde x(t)\big] = \lim_{t\to\infty} \operatorname{tr}\,\tilde Q(t)\tilde R, \tag{1.5}$$

where $\tilde R \in \mathbb{R}^{\tilde n\times\tilde n}$ and

$$\tilde Q(t) = \mathbb{E}\big[\tilde x(t)\,\tilde x(t)^T\big]. \tag{1.6}$$

As  will  be  seen,  (1.5)  can  arise  in  either  reduced-order  modelling,  reduced-order 
state estimation or reduced-order dynamic-compensation problems. It cannot be
overemphasized that the sole state statistic of design interest is the
state covariance (1.6). From Ito calculus it follows that $\tilde Q(t)$ is given for the
naive model (1.2) by

$$\dot{\tilde Q}(t) = \tilde A\tilde Q(t) + \tilde Q(t)\tilde A^T + \sum_{i=1}^{p} \tilde A_i\tilde Q(t)\tilde A_i^T + V \tag{1.7}$$

and for the corrected model (1.3) by

$$\dot{\tilde Q}(t) = \tilde A_s\tilde Q(t) + \tilde Q(t)\tilde A_s^T + \sum_{i=1}^{p} \tilde A_i\tilde Q(t)\tilde A_i^T + V. \tag{1.8}$$

Each of these stochastic Lyapunov differential equations should be regarded as
$\tilde n(\tilde n+1)/2$ ordinary differential equations. The question we wish to address is the
following: How do the solutions of the stochastic Lyapunov equations (1.7) and
(1.8) differ from each other and from the "deterministic" Lyapunov equation

$$\dot{\tilde Q}(t) = \tilde A\tilde Q(t) + \tilde Q(t)\tilde A^T + V, \tag{1.9}$$

particularly in the presence of high uncertainty levels? The answer to this
question of course depends upon the stochastic modification terms which for the
naive model are given by

$$M_n[\tilde Q(t)] = \sum_{i=1}^{p} \tilde A_i\tilde Q(t)\tilde A_i^T \tag{1.10}$$

and for the corrected model by

$$M_c[\tilde Q(t)] = \sum_{i=1}^{p} \Big[\tfrac{1}{2}\tilde A_i^2\tilde Q(t) + \tfrac{1}{2}\tilde Q(t)(\tilde A_i^2)^T + \tilde A_i\tilde Q(t)\tilde A_i^T\Big], \tag{1.11}$$

so that (1.7) and (1.8) each take the form of (1.9) with the respective
modification term added to the right-hand side.



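We now consider a system consisting of a pair of lightly-damped modes

$$\tilde A = \begin{bmatrix} 0 & -\omega_1 & 0 & 0 \\ \omega_1 & -2\eta_1 & 0 & 0 \\ 0 & 0 & 0 & -\omega_2 \\ 0 & 0 & \omega_2 & -2\eta_2 \end{bmatrix},$$

where $\eta_i = \zeta_i\omega_i$. To represent frequency uncertainties let

$$\tilde A_1 = \gamma_1 \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \qquad \tilde A_2 = \gamma_2 \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix},$$

where for simplicity we have ignored the effects of frequency uncertainties on the
effective decay rate $\eta_i$. The magnitudes of the uncertainties are scaled by means
of $\gamma_1$ and $\gamma_2$. For this example the naive stochastic modification has
the form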


$$M_n[\tilde Q(t)] = \begin{bmatrix} \gamma_1^2\tilde Q_{22}(t) & -\gamma_1^2\tilde Q_{12}(t) & 0 & 0 \\ -\gamma_1^2\tilde Q_{21}(t) & \gamma_1^2\tilde Q_{11}(t) & 0 & 0 \\ 0 & 0 & \gamma_2^2\tilde Q_{44}(t) & -\gamma_2^2\tilde Q_{34}(t) \\ 0 & 0 & -\gamma_2^2\tilde Q_{43}(t) & \gamma_2^2\tilde Q_{33}(t) \end{bmatrix}.$$

Although the off-diagonal terms have a stabilizing effect, it is clear
that the diagonal elements destabilize the state variances. Hence, it is not
surprising that for sufficiently high uncertainty levels, i.e., $\gamma_i \gg 0$, the
naive model is second-moment unstable. These observations are completely in
accordance with the threshold uncertainty principle. The corrected stochastic
modification, however, has the form


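$$M_c[\tilde Q(t)] = \begin{bmatrix}
\gamma_1^2[\tilde Q_{22}(t)-\tilde Q_{11}(t)] & -2\gamma_1^2\tilde Q_{12}(t) & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{13}(t) & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{14}(t) \\
-2\gamma_1^2\tilde Q_{21}(t) & \gamma_1^2[\tilde Q_{11}(t)-\tilde Q_{22}(t)] & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{23}(t) & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{24}(t) \\
-\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{31}(t) & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{32}(t) & \gamma_2^2[\tilde Q_{44}(t)-\tilde Q_{33}(t)] & -2\gamma_2^2\tilde Q_{34}(t) \\
-\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{41}(t) & -\tfrac12(\gamma_1^2+\gamma_2^2)\tilde Q_{42}(t) & -2\gamma_2^2\tilde Q_{43}(t) & \gamma_2^2[\tilde Q_{33}(t)-\tilde Q_{44}(t)]
\end{bmatrix},$$

which also has stabilizing off-diagonal elements but has fundamentally different
diagonal elements: Rather than destabilizing the state variances, the diagonal
elements of the corrected stochastic modification are equilibrating. This effect
is even more striking when $M_n$ and $M_c$ are transformed into the basis with
respect to which

$$\tilde A = \begin{bmatrix} -j\omega_1-\eta_1 & 0 & 0 & 0 \\ 0 & j\omega_1-\eta_1 & 0 & 0 \\ 0 & 0 & -j\omega_2-\eta_2 & 0 \\ 0 & 0 & 0 & j\omega_2-\eta_2 \end{bmatrix},$$

where higher-order terms in $\eta_i$ have been ignored. In this basis, the diagonal
terms of $M_n[\tilde Q(t)]$ are destabilizing whereas the diagonal terms of $M_c[\tilde Q(t)]$
exactly vanish.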
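
The contrast between the two modification terms can be checked numerically for a single mode. The sketch below is an illustration only: the uncertainty magnitude `gamma`, the test covariance `Q` and all helper names are assumptions, and plain Python lists stand in for matrices. It evaluates the sum of A_i Q A_i^T (naive) and the same sum with the Stratonovich drift terms added (corrected), then compares their traces, i.e., their net effect on the total state variance.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_T(X):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]

def mat_add(*Ms):
    """Entrywise sum of equally sized matrices."""
    rows, cols = len(Ms[0]), len(Ms[0][0])
    return [[sum(M[i][j] for M in Ms) for j in range(cols)] for i in range(rows)]

def mat_scale(c, X):
    return [[c * v for v in row] for row in X]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def naive_modification(A_list, Q):
    """Sum over i of A_i Q A_i^T."""
    return mat_add(*[mat_mul(mat_mul(Ai, Q), mat_T(Ai)) for Ai in A_list])

def corrected_modification(A_list, Q):
    """Sum over i of (1/2) A_i^2 Q + (1/2) Q (A_i^2)^T + A_i Q A_i^T."""
    terms = []
    for Ai in A_list:
        Ai2 = mat_mul(Ai, Ai)
        terms.append(mat_scale(0.5, mat_mul(Ai2, Q)))
        terms.append(mat_scale(0.5, mat_mul(Q, mat_T(Ai2))))
        terms.append(mat_mul(mat_mul(Ai, Q), mat_T(Ai)))
    return mat_add(*terms)

gamma = 0.5                             # hypothetical uncertainty magnitude
A1 = [[0.0, -gamma], [gamma, 0.0]]      # frequency-uncertainty pattern, one mode
Q = [[2.0, 0.3], [0.3, 1.0]]            # arbitrary symmetric test covariance

M_naive = naive_modification([A1], Q)
M_corr = corrected_modification([A1], Q)

print("trace of naive modification    :", trace(M_naive))
print("trace of corrected modification:", trace(M_corr))
```

In this example the naive term adds gamma^2 times the total variance to the variance rate, whereas the corrected term only exchanges variance between the two states of the mode, so its trace is zero.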


The negative coefficients in the off-diagonal terms imply progressive
decorrelation between pairs of dynamical states. This informational or
statistical damping phenomenon is a direct result of parameter uncertainties that
is captured by the multiplicative white noise model.* The Stratonovich
correction, moreover, is crucial: By neutralizing the threshold uncertainty
principle, it permits the consideration of long-term effects for arbitrary
uncertainty levels.
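
The threshold effect and its removal can be reproduced for one lightly-damped mode by integrating the second-moment equations directly. The following is a sketch under stated assumptions: a single mode with A = [[0, -omega], [omega, -2*eta]], uncertainty pattern A_1 = gamma*[[0, -1], [1, 0]], unit disturbance intensity, zero initial covariance, and a hand-written fourth-order Runge-Kutta integrator. The numerical values and function names are hypothetical.

```python
def covariance_rates(q, omega, eta, gamma, corrected):
    """Rates of the covariance entries (a, b, c) of Q = [[a, b], [b, c]].

    Implements dQ/dt = A Q + Q A^T + A_1 Q A_1^T + I for the naive model and
    replaces A by A_s = A + (1/2) A_1^2 = A - (gamma^2 / 2) I when corrected.
    """
    a, b, c = q
    g2 = gamma * gamma
    da = -2.0 * omega * b + g2 * c + 1.0
    db = omega * (a - c) - 2.0 * eta * b - g2 * b
    dc = 2.0 * omega * b - 4.0 * eta * c + g2 * a + 1.0
    if corrected:
        # Extra drift from the Stratonovich correction: -gamma^2 * Q
        da -= g2 * a
        db -= g2 * b
        dc -= g2 * c
    return (da, db, dc)

def covariance_trace(omega, eta, gamma, corrected, t_end=10.0, dt=1e-3):
    """Integrate from Q(0) = 0 with classical RK4 and return tr Q(t_end)."""
    q = (0.0, 0.0, 0.0)

    def f(s):
        return covariance_rates(s, omega, eta, gamma, corrected)

    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    for _ in range(int(round(t_end / dt))):
        k1 = f(q)
        k2 = f(shift(q, k1, 0.5 * dt))
        k3 = f(shift(q, k2, 0.5 * dt))
        k4 = f(shift(q, k3, dt))
        q = tuple(qi + dt / 6.0 * (p1 + 2.0 * p2 + 2.0 * p3 + p4)
                  for qi, p1, p2, p3, p4 in zip(q, k1, k2, k3, k4))
    return q[0] + q[2]

# Hypothetical values: light damping, uncertainty well above the damping level.
trace_naive = covariance_trace(omega=1.0, eta=0.05, gamma=1.0, corrected=False)
trace_corrected = covariance_trace(omega=1.0, eta=0.05, gamma=1.0, corrected=True)

print("tr Q(10), naive model    :", trace_naive)
print("tr Q(10), corrected model:", trace_corrected)
```

For these values the naive trace grows exponentially, while the corrected trace cannot grow faster than the disturbance input rate, since the correction cancels the variance-injecting diagonal terms.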


The  far-reaching  ramifications  of  these  observations  are  explored 
extensively  in  [1-10].  As  an  example,  assume  (as  is  usually  the  case  in  practice) 
that  uncertainties  in  modal  frequency  obtained  from  a  finite-element  analysis  of  a 
large  flexible  space  structure  increase  with  mode  number.  From  the  form  of 
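$M_c[\tilde Q(t)]$ it is easy to deduce that the steady-state covariance

$$\tilde Q = \lim_{t\to\infty} \tilde Q(t)$$

satisfying

$$0 = \tilde A_s\tilde Q + \tilde Q\tilde A_s^T + \sum_{i=1}^{p} \tilde A_i\tilde Q\tilde A_i^T + V \tag{1.12}$$
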


becomes  increasingly  diagonally  dominant  with  increasing  frequency  and  thus 
assumes  the  qualitative  form  given  in  Figure  3.  The  benefits  of  this  sparse  form 
are  important:  The  computational  effort  required  to  determine  the  steady-state 
covariance  (and  thus  to  design  a  closed-loop  controller,  for  example)  is  directly 
proportional  to  the  amount  of  information  reposed  in  the  model  or,  equivalently, 
inversely  proportional  to  the  level  of  modelled  parameter  uncertainty.  This  casts 
new light on the computational design burden vis-a-vis the modelling question:
The computational burden depends only upon the information actually available. A
simple control-design exercise involving full-state feedback illustrates this
point. The gains for the higher-order modes of the beam in Figure 4, whose
frequency uncertainties increase linearly with frequency, were obtained with
modest computational effort in spite of $\tilde n = 100$ (see Figure 5). Another important
ramification of the qualitative form of $\tilde Q$ is the automatic generation of a
high/low-authority control law. Note that for the higher-order and hence
highly-uncertain modes the control gains indicate an inherently-stable, low
performance rate-feedback control law, whereas for the lowest-order modes the
control law is high authority, i.e., "LQ" in character.

*Needless to say, these qualitative effects are statistical and do not refer to
what a particular structure sitting in a test facility or on-orbit is actually
doing!

The major design results are the first-order necessary conditions for
optimality in the presence of maximum entropy modelling. For the reduced-order
modelling problem, the following notation is needed: $x \in \mathbb{R}^n$, $y \in \mathbb{R}^{\ell}$; $w \in \mathbb{R}^m$ is
white noise with intensity $V > 0$; $n_m \le n$, $x_m \in \mathbb{R}^{n_m}$, $y_m \in \mathbb{R}^{\ell}$; $A$, $A_1, \ldots, A_p$, $B$, $C$,
$A_m$, $B_m$, $C_m$ and $R$, where $R > 0$, are real matrices whose dimensions are consistent
with their context below; and $\alpha_1, \ldots, \alpha_p$ are zero-mean unit-variance mutually
uncorrelated scalar white noises which are also uncorrelated with $w$. In this
problem, the order $n_m$ and structure of the reduced-order model are fixed and the
problem is concerned with determining $A_m$, $B_m$ and $C_m$.

Optimal Reduced-Order Modelling Problem (ROM). Given the model

$$\dot x = \Big(A + \sum_{i=1}^{p} \alpha_i A_i\Big)x + Bw, \tag{2.1}$$

$$y = Cx, \tag{2.2}$$

design a reduced-order model

$$\dot x_m = A_m x_m + B_m w, \tag{2.3}$$

$$y_m = C_m x_m \tag{2.4}$$

which minimizes the model-reduction criterion

$$J(A_m,B_m,C_m) = \lim_{t\to\infty} \mathbb{E}\big[(y-y_m)^T R\,(y-y_m)\big]. \tag{2.5}$$


For the reduced-order state-estimation problem, let $x$, $y$, $A$, $C$ and
$\alpha_1, \ldots, \alpha_p$ be as in the reduced-order modelling problem. Furthermore, assume
$C_1, \ldots, C_p$ have the same dimensions as $C$; $w_1 \in \mathbb{R}^n$ and $w_2 \in \mathbb{R}^{\ell}$ are white
noises uncorrelated with the $\alpha_i$'s with intensities $V_1 \ge 0$ and $V_2 > 0$,
respectively, and cross intensity $V_{12} \in \mathbb{R}^{n\times\ell}$; $n_e \le n$, $x_e \in \mathbb{R}^{n_e}$, $y_e \in \mathbb{R}^{q}$, $L \in \mathbb{R}^{q\times n}$,
$R \in \mathbb{R}^{q\times q}$; and $A_e$, $B_e$ and $C_e$ are matrices of suitable dimension.
In this formulation the matrix $L$ identifies the states, or linear combinations of
states, whose estimates are desired. The order $n_e$ of the estimator state $x_e$
is considered fixed and is determined by implementation constraints, i.e., by the
computing capability available for realizing (2.8) and (2.9) in real time. Hence
the problem is concerned with determining $A_e$, $B_e$ and $C_e$.

Optimal Reduced-Order State-Estimation Problem (ROSE). Given the
observed system

$$\dot x = \Big(A + \sum_{i=1}^{p} \alpha_i A_i\Big)x + w_1, \tag{2.6}$$

$$y = \Big(C + \sum_{i=1}^{p} \alpha_i C_i\Big)x + w_2, \tag{2.7}$$

design a reduced-order state estimator

$$\dot x_e = A_e x_e + B_e y, \tag{2.8}$$

$$y_e = C_e x_e \tag{2.9}$$

which minimizes the state-estimation criterion

$$J(A_e,B_e,C_e) = \lim_{t\to\infty} \mathbb{E}\big[(Lx-y_e)^T R\,(Lx-y_e)\big]. \tag{2.10}$$


To state the reduced-order dynamic-compensation problem, let $x$, $A$,
$A_1, \ldots, A_p$, $C$, $C_1, \ldots, C_p$, $y$, $w_1$, $w_2$, $V_1$, $V_2$, $V_{12}$ and $\alpha_1, \ldots, \alpha_p$ be
as in the reduced-order state-estimation problem. Furthermore, assume $u \in \mathbb{R}^m$;
$B, B_1, \ldots, B_p \in \mathbb{R}^{n\times m}$; $R_1$, $R_{12}$ and $R_2$ are matrices of suitable dimension such that
$R_1 \ge 0$, $R_2 > 0$ and $R_1 - R_{12}R_2^{-1}R_{12}^T \ge 0$; $n_c \le n$; and $A_c$, $B_c$ and $C_c$ are matrices
of appropriate dimension. We require the technical assumption that, for each $i$,
$B_i \ne 0$ implies $C_i = 0$. In this problem, the order $n_c$ is considered fixed
and is determined by implementation constraints. The dynamic compensator is
assumed to have purely dynamic linear structure and the problem is concerned with
determining $A_c$, $B_c$ and $C_c$.

Optimal Reduced-Order Dynamic-Compensation Problem (RODC). Given the
control system

$$\dot x = \Big(A + \sum_{i=1}^{p} \alpha_i A_i\Big)x + \Big(B + \sum_{i=1}^{p} \alpha_i B_i\Big)u + w_1, \tag{2.11}$$

$$y = \Big(C + \sum_{i=1}^{p} \alpha_i C_i\Big)x + w_2, \tag{2.12}$$

design a reduced-order dynamic compensator

$$\dot x_c = A_c x_c + B_c y, \tag{2.13}$$

$$u = C_c x_c \tag{2.14}$$

which minimizes the performance criterion

$$J(A_c,B_c,C_c) = \lim_{t\to\infty} \mathbb{E}\big[x^T R_1 x + 2x^T R_{12} u + u^T R_2 u\big]. \tag{2.15}$$


Let $n_r$ and $(A_r,B_r,C_r)$ generically denote $n_m$, $n_e$, $n_c$ and
$(A_m,B_m,C_m)$, $(A_e,B_e,C_e)$, $(A_c,B_c,C_c)$. To guarantee that $J$ is finite and
independent of initial conditions, we restrict $(A_r,B_r,C_r)$ to the (open) set of
second-moment-stable triples

$$\mathcal{S} \triangleq \Big\{(A_r,B_r,C_r):\ \tilde A_s \otimes I_{n+n_r} + I_{n+n_r} \otimes \tilde A_s + \sum_{i=1}^{p} \tilde A_i \otimes \tilde A_i \ \text{is stable}\Big\},$$

where $I_\nu$ is the $\nu\times\nu$ identity matrix, $\otimes$ denotes Kronecker product (see [120]) and

$$\tilde A_s \triangleq \tilde A + \tfrac{1}{2}\sum_{i=1}^{p} \tilde A_i^2.$$

To avoid degeneracy in the derivation of the main results, we further restrict
(without loss of generality) $(A_r,B_r,C_r)$ to

$$\mathcal{S}_+ \triangleq \big\{(A_r,B_r,C_r) \in \mathcal{S}:\ (A_r,B_r)\ \text{is controllable and}\ (A_r,C_r)\ \text{is observable}\big\}.$$
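
The Kronecker expression in the definition of the stable set is the matrix of the second-moment dynamics written in vectorized form. The sketch below checks this identity on a small example, assuming a row-by-row stacking of the covariance; the 2x2 matrices `As`, `A1` and `Q` are hypothetical stand-ins rather than an actual closed-loop system, and all helper names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_T(X):
    return [list(col) for col in zip(*X)]

def mat_add(*Ms):
    rows, cols = len(Ms[0]), len(Ms[0][0])
    return [[sum(M[i][j] for M in Ms) for j in range(cols)] for i in range(rows)]

def kron(X, Y):
    """Kronecker product: block (i, k) of the result is X[i][k] * Y."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def vec_rows(X):
    """Stack the rows of X into one long vector."""
    return [v for row in X for v in row]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical stand-ins for the corrected dynamics and one uncertainty pattern.
As = [[-0.3, -1.0], [1.0, -0.4]]
A1 = [[0.0, -0.5], [0.5, 0.0]]
Q = [[2.0, 0.3], [0.3, 1.0]]

I2 = identity(2)
operator = mat_add(kron(As, I2), kron(I2, As), kron(A1, A1))

# Left side: the Kronecker operator applied to the stacked covariance.
lhs = mat_vec(operator, vec_rows(Q))
# Right side: As Q + Q As^T + A1 Q A1^T, stacked the same way.
rhs = vec_rows(mat_add(mat_mul(As, Q),
                       mat_mul(Q, mat_T(As)),
                       mat_mul(mat_mul(A1, Q), mat_T(A1))))

max_gap = max(abs(l - r) for l, r in zip(lhs, rhs))
print("largest entrywise difference:", max_gap)
```

Stability of the Kronecker operator is therefore the same statement as decay of every solution of the homogeneous second-moment equation, which is why the set is described as second-moment stable.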

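Call a square matrix positive semisimple if it has positive eigenvalues
and a diagonal Jordan canonical form, i.e., if it is similar to a positive
diagonal (or, equivalently, a positive definite) matrix. Let $\rho(Z)$ denote the rank
of a matrix $Z$. The following result is an immediate consequence of Theorem 6.2.5,
p. 124 of [113].

Lemma 2.1. If $n\times n$ $\hat Q$, $\hat P$ are nonnegative definite and $\rho(\hat Q\hat P) = n_r$ then
there exist $n_r\times n$ $G$, $\Gamma$ and $n_r\times n_r$ positive-semisimple $M$ such that

$$\hat Q\hat P = G^T M \Gamma, \tag{2.16}$$

$$\Gamma G^T = I_{n_r}. \tag{2.17}$$

For convenience in stating the main results, we shall refer to $n_r\times n$
$G$, $\Gamma$ and $n_r\times n_r$ positive-semisimple $M$ satisfying (2.16) and (2.17) as a
$(G,M,\Gamma)$-factorization of $\hat Q\hat P$. Also, we shall utilize the compact notation
illustrated by

$$\underline{A}Q\underline{A}^T = \sum_{i=1}^{p} A_i Q A_i^T, \qquad \underline{A}Q\underline{B} = \sum_{i=1}^{p} A_i Q B_i, \qquad \text{etc.}$$

Hence, define for nonnegative-definite matrices $Q$, $P$, $\hat Q$ and $\hat P$ and a
$(G,M,\Gamma)$-factorization of $\hat Q\hat P$,

$$A_s \triangleq A + \tfrac12\underline{A}^2, \qquad B_s \triangleq B + \tfrac12\underline{A}\,\underline{B}, \qquad C_s \triangleq C + \tfrac12\underline{C}\,\underline{A},$$

$$R_{2s} \triangleq R_2 + \underline{B}^T(P+\hat P)\underline{B}, \qquad V_{2s} \triangleq V_2 + \underline{C}(Q+\hat Q)\underline{C}^T,$$

$$\mathcal{Q}_s \triangleq QC_s^T + V_{12} + \underline{A}(Q+\hat Q)\underline{C}^T, \qquad \mathcal{P}_s \triangleq B_s^TP + R_{12}^T + \underline{B}^T(P+\hat P)\underline{A},$$

$$\tau \triangleq G^T\Gamma, \qquad \tau_\perp \triangleq I_n - \tau.$$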
1  n 

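Theorem 2.1. Suppose $(A_m,B_m,C_m) \in \mathcal{S}_+$ solves the optimal
reduced-order modelling problem. Then there exist $n\times n$ nonnegative-definite
matrices $\hat Q$ and $\hat P$ such that, for some $(G,M,\Gamma)$-factorization of $\hat Q\hat P$, $A_m$, $B_m$ and
$C_m$ are given by

$$A_m = \Gamma A_s G^T, \tag{2.18}$$

$$B_m = \Gamma B, \tag{2.19}$$

$$C_m = C G^T, \tag{2.20}$$

and such that the following conditions are satisfied:

$$0 = A_s\hat Q + \hat QA_s^T + BVB^T - \tau_\perp BVB^T\tau_\perp^T, \tag{2.21}$$

$$0 = A_s^T\hat P + \hat PA_s + C^TRC - \tau_\perp^T C^TRC\,\tau_\perp, \tag{2.22}$$

$$\rho(\hat Q) = \rho(\hat P) = \rho(\hat Q\hat P) = n_m. \tag{2.23}$$
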
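Theorem 2.2. Suppose $(A_e,B_e,C_e) \in \mathcal{S}_+$ solves the optimal
reduced-order state-estimation problem. Then there exist $n\times n$ nonnegative-definite
matrices $Q$, $\hat Q$ and $\hat P$ such that, for some $(G,M,\Gamma)$-factorization of $\hat Q\hat P$, $A_e$, $B_e$
and $C_e$ are given by

$$A_e = \Gamma(A_s - \mathcal{Q}_sV_{2s}^{-1}C_s)G^T, \tag{2.24}$$

$$B_e = \Gamma\mathcal{Q}_sV_{2s}^{-1}, \tag{2.25}$$

$$C_e = LG^T, \tag{2.26}$$

and such that the following conditions are satisfied:

$$0 = A_sQ + QA_s^T + \underline{A}(Q+\hat Q)\underline{A}^T + V_1 - \mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T + \tau_\perp\mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T\tau_\perp^T, \tag{2.27}$$

$$0 = A_s\hat Q + \hat QA_s^T + \mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T - \tau_\perp\mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T\tau_\perp^T, \tag{2.28}$$

$$0 = (A_s - \mathcal{Q}_sV_{2s}^{-1}C_s)^T\hat P + \hat P(A_s - \mathcal{Q}_sV_{2s}^{-1}C_s) + L^TRL - \tau_\perp^TL^TRL\,\tau_\perp, \tag{2.29}$$

$$\rho(\hat Q) = \rho(\hat P) = \rho(\hat Q\hat P) = n_e. \tag{2.30}$$
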

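Theorem 2.3. Suppose $(A_c,B_c,C_c) \in \mathcal{S}_+$ solves the optimal
reduced-order dynamic-compensation problem. Then there exist $n\times n$
nonnegative-definite matrices $Q$, $P$, $\hat Q$ and $\hat P$ such that, for some
$(G,M,\Gamma)$-factorization of $\hat Q\hat P$, $A_c$, $B_c$ and $C_c$ are given by

$$A_c = \Gamma(A_s - B_sR_{2s}^{-1}\mathcal{P}_s - \mathcal{Q}_sV_{2s}^{-1}C_s)G^T, \tag{2.31}$$

$$B_c = \Gamma\mathcal{Q}_sV_{2s}^{-1}, \tag{2.32}$$

$$C_c = -R_{2s}^{-1}\mathcal{P}_sG^T, \tag{2.33}$$

and such that the following conditions are satisfied:

$$0 = A_sQ + QA_s^T + \underline{A}Q\underline{A}^T + V_1 + (\underline{A} - \underline{B}R_{2s}^{-1}\mathcal{P}_s)\hat Q(\underline{A} - \underline{B}R_{2s}^{-1}\mathcal{P}_s)^T - \mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T + \tau_\perp\mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T\tau_\perp^T, \tag{2.34}$$

$$0 = A_s^TP + PA_s + \underline{A}^TP\underline{A} + R_1 + (\underline{A} - \mathcal{Q}_sV_{2s}^{-1}\underline{C})^T\hat P(\underline{A} - \mathcal{Q}_sV_{2s}^{-1}\underline{C}) - \mathcal{P}_s^TR_{2s}^{-1}\mathcal{P}_s + \tau_\perp^T\mathcal{P}_s^TR_{2s}^{-1}\mathcal{P}_s\tau_\perp, \tag{2.35}$$

$$0 = (A_s - B_sR_{2s}^{-1}\mathcal{P}_s)\hat Q + \hat Q(A_s - B_sR_{2s}^{-1}\mathcal{P}_s)^T + \mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T - \tau_\perp\mathcal{Q}_sV_{2s}^{-1}\mathcal{Q}_s^T\tau_\perp^T, \tag{2.36}$$

$$0 = (A_s - \mathcal{Q}_sV_{2s}^{-1}C_s)^T\hat P + \hat P(A_s - \mathcal{Q}_sV_{2s}^{-1}C_s) + \mathcal{P}_s^TR_{2s}^{-1}\mathcal{P}_s - \tau_\perp^T\mathcal{P}_s^TR_{2s}^{-1}\mathcal{P}_s\tau_\perp, \tag{2.37}$$

$$\rho(\hat Q) = \rho(\hat P) = \rho(\hat Q\hat P) = n_c. \tag{2.38}$$
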

Remark 2.1. Because of (2.17) the $n\times n$ matrix $\tau$ which couples the design
equations for each problem is idempotent, i.e., $\tau^2 = \tau$. In general this
"optimal projection" is an oblique projection (as opposed to an orthogonal
projection) since it is not necessarily symmetric. Note that from Sylvester's
inequality it follows that $\rho(\tau) = n_r$.

Remark 2.2. Since $\hat Q\hat P$ is nonnegative semisimple it has a group
generalized inverse $(\hat Q\hat P)^{\#}$ given by $G^TM^{-1}\Gamma$ (see, e.g., [114], p. 124). Hence
by (2.17) the optimal projection $\tau$ is given in closed form by

$$\tau = \hat Q\hat P(\hat Q\hat P)^{\#}. \tag{2.39}$$
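
The properties stated in Remarks 2.1 and 2.2 can be verified on a small hand-built factorization. The sketch below is illustrative only: `G`, `Gamma` and `M` are hypothetical matrices chosen so that Gamma G^T is the identity, and the matrix `N` merely stands in for the product of the two nonnegative-definite matrices, since it is assembled from the factorization rather than from solutions of the design equations. Exact rational arithmetic (`fractions.Fraction`) is used so that equalities can be tested without tolerances.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_T(X):
    return [list(col) for col in zip(*X)]

# Hypothetical factors with n = 3 and n_r = 2, chosen so that Gamma G^T = I.
G = [[F(1), F(0), F(0)],
     [F(0), F(1), F(0)]]
Gamma = [[F(1), F(0), F(2)],
         [F(0), F(1), F(3)]]
M = [[F(2), F(0)],
     [F(0), F(5)]]              # positive diagonal, hence positive semisimple
M_inv = [[F(1, 2), F(0)],
         [F(0), F(1, 5)]]

gamma_g_t = mat_mul(Gamma, mat_T(G))                 # should be the 2x2 identity
tau = mat_mul(mat_T(G), Gamma)                       # candidate projection G^T Gamma
N = mat_mul(mat_mul(mat_T(G), M), Gamma)             # stand-in product G^T M Gamma
N_group_inv = mat_mul(mat_mul(mat_T(G), M_inv), Gamma)

print("Gamma G^T  =", gamma_g_t)
print("tau        =", tau)
print("idempotent :", mat_mul(tau, tau) == tau)
print("symmetric  :", tau == mat_T(tau))
print("closed form:", mat_mul(N, N_group_inv) == tau)
print("rank (trace of an idempotent):", sum(tau[i][i] for i in range(3)))
```

Because the projection here is not symmetric, it is oblique in the sense of Remark 2.1; its rank equals its trace because it is idempotent.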


Remark  2.3.  The  Kalman  filter  and  LQG  results  can  be  obtained  as 
special  cases  of  Theorems  2.2  and  2.3. 

Remark  2.4.  The  following  effects  of  the  multiplicative  white  noise 
model  are  evident:  (1)  diagonal  dominance  in  Q  and  P  for  modal  systems  and  (2) 
suppressed  system  gains  due  to  the  definitions  of  $V_{2s}$  and  $R_{2s}$.

Remark  2.5.  The  formulation  of  the  necessary  conditions  for  feedback
control  in  the  presence  of  multiplicative  white  noise  as  a  system  of  four  matrix 
equations  was  discovered  independently  by  Hyland  [5]  and  Milshtein  [89]. 


3.  Design  Capabilities  and  Limitations


In  an  attempt  to  minimize  confusion  in  evaluating  practicable  design 
methodologies,  we  roughly  categorize  the  pertinent  issues  according  to  (1)  design 
tradeoffs,  (2)  modelling  validity  and  (3)  design-procedure  complexity.  The 
purpose  of  this  section  is  to  stress  the  multifaceted  nature  of  the  design  process 
and to provide a setting in which to evaluate OP/ME design synthesis and
alternative  methods. 

3.1 Design Tradeoffs

For control system design there are well-known tradeoffs among, for
example, modelling validity, performance, stability, robustness, design-procedure
complexity and implementation complexity. Although their relative importance is
highly problem dependent, we feel that ultimately the most important attribute of
a design methodology is not its ability to meet isolated design objectives but
rather its ability to efficiently and optimally quantify tradeoffs among design
objectives. Within the validity of its modelling formalism, optimal
projection/maximum entropy design synthesis facilitates performance/robustness and
performance/implementation complexity tradeoffs.

3.2  Modelling  Validity 

Stochastic linear ordinary differential equations with additive and
multiplicative white disturbances permit representation of parameter uncertainties
and, possibly, mild nonlinearities. Treatment of nonwhite disturbances is possible
by means of system augmentation, and extension to distributed parameter systems
has been demonstrated ([21]). The quadratic performance criterion, although
inappropriate for certain classes of problems, is particularly relevant to
pointing and shaping of precision space structures. The state-space setting
appears to permit more highly structured treatment of parameter errors than
frequency domain approaches. The ramifications of the Stratonovich white-noise
model, however, remain to be explored for general systems.


3.3  Design-Procedure  Complexity 

Evaluating  advantages  and  disadvantages  of  a  given  design  procedure  is 
extremely  complex  because  of  dependence  upon  the  underlying  problem,  design  goals, 
design  constraints,  modelling  validity  and  available  design  resources  (human  and 
computational).  The  underlying  basis  of  the  OP/ME  design  philosophy  is  to 
quantify  available  information,  collocate  available  design  variables  and  mechanize 
the  design  process  to  the  greatest  possible  extent.  This  view  is  motivated  by 
high-order  systems  with  numerous  strongly  coupled  sensors  and  actuators. 
Computational  algorithms  for  solving  the  optimal  projection  equations  have  been 
developed  in  [14,17].  The  principal  difficulty  arises  from  the  presence  of 
multiple  extrema  as  a  consequence  of  reduced-order  design.  As  alluded  to 
previously,  however,  uncertainty  effects  tend  to  simplify  computations. 


Distributed  parameter  systems 
Figure 3. Information regimes: qualitative structure of the steady-state covariance $Q \triangleq \lim_{t \to \infty} Q(t)$, $Q(t) \triangleq \mathbb{E}[x(t)x(t)^T]$ (frequency uncertainties increase with mode number).
Abstract


First-order necessary conditions for optimal, steady-state, reduced-order
state estimation for a linear, time-invariant plant in the presence of correlated
disturbance and nonsingular measurement noise are derived in a new and
highly-simplified form. In contrast to the lone matrix Riccati equation arising
in the full-order (Kalman filter) case, the optimal steady-state reduced-order
estimator is characterized by three matrix equations (one modified Riccati equation
and two modified Lyapunov equations) coupled by a projection whose rank
is precisely equal to the order of the estimator and which determines the
optimal estimator gains. This coupling is a graphic reminder of the suboptimality
of proposed approaches involving either model reduction followed by "full-order"
estimator design or full-order estimator design followed by estimator-reduction
techniques. The results given here complement recently-obtained results which
characterize the optimal reduced-order model by means of a pair of coupled modified
Lyapunov equations ([7]) and the optimal fixed-order dynamic compensator by
means of a coupled system of two modified Riccati equations and two modified
Lyapunov equations ([6]).


1. Introduction


It has recently been shown (see [1-7]) that the first-order necessary conditions
for the problems of optimal model reduction and optimal fixed-order
dynamic compensation can be formulated in terms of an "optimal projection"
matrix which arises as a direct consequence of optimality. These necessary
conditions, by virtue of their remarkable simplicity, yield insight into the
structure of the optimal design and permit the development of alternative numerical
algorithms ([2,4,7]). The purpose of this note is to develop analogous
first-order necessary conditions for the reduced-order state-estimation problem.
Since this problem falls midway between the problems of open-loop model reduction
and closed-loop fixed-order dynamic compensation, it is not surprising
that the necessary conditions for these problems are correspondingly related.
Specifically, while the optimal projection equations for model reduction consist
of a system of two matrix equations (a pair of modified Lyapunov equations)
and the optimal projection equations for fixed-order dynamic compensation comprise
a system of four matrix equations (a pair of modified Lyapunov equations
plus a pair of modified Riccati equations), the optimal projection equations
for reduced-order state estimation form a system of three matrix equations
(a pair of modified Lyapunov equations along with a single modified Riccati
equation). In each case the system of matrix equations is coupled by an oblique
projection (idempotent matrix) which determines the gains of the optimal
reduced-order system, whether it be a model, estimator or compensator.

The need for designing an optimal reduced-order state estimator for a
high-order dynamic system follows directly from real-world constraints on computing
capability. A further motivation is the fact that although a system may
have many degrees of freedom, it is often the case that estimates of only a small
number of state variables are actually required. In the face of these practical
motivations, numerous approaches to designing reduced-order state estimators have
been proposed. See [8] for a recent review of previous results.

An important fact pointed out in [8] and [9] is that reduced-order estimators
designed by means of either model reduction followed by "full-order" state
estimation or full-order estimation followed by estimator reduction will not be
optimal for the given order. In the present paper this point is graphically confirmed
by the fact that the three matrix equations characterizing the optimal
reduced-order state estimator reveal intrinsic coupling (via the optimal
projection) between the "operations" of optimal estimation (the modified Riccati
equation) and optimal model reduction (the pair of modified Lyapunov equations).


2. Problem Statement and Main Result


The following notation and definitions will be used throughout the paper:

$n,\ \ell,\ n_e,\ p$                      positive integers, $1 \le n_e \le n$
$x,\ y,\ x_e,\ y_e$                       $n$, $\ell$, $n_e$, $p$-dimensional vectors
$A,\ C,\ L$                               $n \times n$, $\ell \times n$, $p \times n$ matrices
$A_e,\ B_e,\ C_e$                         $n_e \times n_e$, $n_e \times \ell$, $p \times n_e$ matrices
$w_1(t),\ t \ge 0$                        $n$-dimensional white noise with nonnegative-definite intensity $V_1$
$w_2(t),\ t \ge 0$                        $\ell$-dimensional white noise with positive-definite intensity $V_2$
$V_{12}$                                  $n \times \ell$ matrix satisfying $\mathbb{E}[w_1(t)w_2(s)^T] = V_{12}\delta(t-s)$
$R$                                       $p \times p$ positive-definite matrix
$I_r$                                     $r \times r$ identity matrix
$Z^T$                                     transpose of vector or matrix $Z$
$Z^{-T}$                                  $(Z^T)^{-1}$ or $(Z^{-1})^T$
$\mathcal{N}(Z),\ \mathcal{R}(Z),\ \rho(Z)$   null space, range, rank of matrix $Z$
$\mathbb{E}$                              expected value
$\mathbb{R},\ \mathbb{R}^{r \times s}$    real numbers, $r \times s$ real matrices
stable matrix                             matrix with eigenvalues in open left half plane
nonnegative-definite matrix               symmetric matrix with nonnegative eigenvalues
positive-definite matrix                  symmetric matrix with positive eigenvalues
nonnegative-semisimple matrix             matrix similar to a nonnegative-definite matrix
positive-semisimple matrix                matrix similar to a positive-definite matrix
positive-diagonal matrix                  diagonal matrix with positive diagonal elements



We consider the following optimal reduced-order state-estimation problem.
Given the system

$$\dot{x} = Ax + w_1, \tag{2.1}$$

$$y = Cx + w_2, \tag{2.2}$$

design a reduced-order state estimator

$$\dot{x}_e = A_e x_e + B_e y, \tag{2.3}$$

$$y_e = C_e x_e, \tag{2.4}$$

which minimizes the error criterion

$$J(A_e, B_e, C_e) = \lim_{t \to \infty} \mathbb{E}\left[(Lx - y_e)^T R (Lx - y_e)\right].$$
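
For a fixed triple $(A_e, B_e, C_e)$ the criterion can be evaluated from a single Lyapunov equation on the augmented state $(x, x_e)$. The sketch below is an illustration rather than the authors' algorithm; it assumes SciPy is available, and the function name `estimation_cost` and the scalar test data are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def estimation_cost(A, C, L, V1, V2, V12, R, Ae, Be, Ce):
    """Steady-state value of J for the estimator (Ae, Be, Ce)."""
    n, ne = A.shape[0], Ae.shape[0]
    # Augmented dynamics of (x, x_e) driven by (w1, Be w2).
    A_aug = np.block([[A, np.zeros((n, ne))], [Be @ C, Ae]])
    V_aug = np.block([[V1, V12 @ Be.T], [Be @ V12.T, Be @ V2 @ Be.T]])
    # Weight such that (Lx - y_e)^T R (Lx - y_e) = z^T R_aug z with z = (x, x_e).
    R_aug = np.block([[L.T @ R @ L, -L.T @ R @ Ce],
                      [-Ce.T @ R @ L, Ce.T @ R @ Ce]])
    if np.any(np.linalg.eigvals(A_aug).real >= 0):
        raise ValueError("A and Ae must both be stable for J to be finite")
    # Solves A_aug Q + Q A_aug^T + V_aug = 0 for the steady-state covariance.
    Q_aug = solve_continuous_lyapunov(A_aug, -V_aug)
    return float(np.trace(Q_aug @ R_aug))

# Hypothetical scalar plant, compared against the full-order Kalman filter.
A = np.array([[-1.0]]); C = np.array([[1.0]]); L = np.array([[1.0]])
V1 = np.array([[2.0]]); V2 = np.array([[1.0]]); V12 = np.zeros((1, 1))
R = np.array([[1.0]])
Q = solve_continuous_are(A.T, C.T, V1, V2)   # filter Riccati solution
K = Q @ C.T @ np.linalg.inv(V2)              # Kalman gain
print(estimation_cost(A, C, L, V1, V2, V12, R, A - K @ C, K, L), Q[0, 0])
```

In the full-order case the two printed numbers should agree, since the Kalman filter attains $J = \mathrm{tr}(Q L^T R L)$; for $n_e < n$ the same function can be used to compare candidate reduced-order designs.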

In this formulation the matrix $L$ identifies the states, or linear combination
of states, whose estimates are desired. The order $n_e$ of the estimator state
$x_e$ is determined by implementation constraints, i.e., by the computing capability
available for realizing (2.3), (2.4) in real time. Hence, $n_e$ is considered to be
fixed in what follows and the problem is concerned with determining $A_e$, $B_e$ and $C_e$.

To guarantee that $J$ is finite it is assumed that $A$ is stable and we restrict
our attention to the set of stable reduced-order estimators

$$\mathcal{A} = \{(A_e, B_e, C_e):\ A_e \text{ is stable}\}.$$

Since the value of $J$ is independent of the internal realization of the transfer
function corresponding to (2.3) and (2.4), without loss of generality we further
restrict our attention to the set of admissible estimators

$$\mathcal{A}_+ \triangleq \{(A_e, B_e, C_e) \in \mathcal{A}:\ (A_e, B_e) \text{ is controllable and } (A_e, C_e) \text{ is observable}\}.$$
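
Membership in the admissible set can be tested directly from its definition. The following is a minimal sketch using the Kalman rank tests; the helper names `is_stable`, `controllability_matrix` and `is_admissible` are hypothetical, and the rank test is adequate only for small, well-conditioned examples.

```python
import numpy as np

def is_stable(A):
    # Stable in the sense used here: all eigenvalues in the open left half plane.
    return bool(np.all(np.linalg.eigvals(A).real < 0))

def controllability_matrix(A, B):
    # [B, AB, ..., A^(k-1) B] for a k x k matrix A.
    k = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(k)])

def is_admissible(Ae, Be, Ce):
    ne = Ae.shape[0]
    controllable = np.linalg.matrix_rank(controllability_matrix(Ae, Be)) == ne
    # Observability of (Ae, Ce) is controllability of the dual pair (Ae^T, Ce^T).
    observable = np.linalg.matrix_rank(controllability_matrix(Ae.T, Ce.T)) == ne
    return is_stable(Ae) and bool(controllable) and bool(observable)

Ae = np.diag([-1.0, -2.0])
print(is_admissible(Ae, np.array([[1.0], [1.0]]), np.array([[1.0, 1.0]])))
print(is_admissible(Ae, np.array([[1.0], [0.0]]), np.array([[1.0, 1.0]])))
```

The second call drops the input to the second estimator state, so the pair $(A_e, B_e)$ loses controllability and the triple is a nonminimal realization that falls outside the admissible set.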




Remark 2.5. Replacing $x_e$ by $Sx_e$, where $S$ is invertible, yields the "equivalent"
estimator $(SA_eS^{-1}, SB_e, C_eS^{-1})$. Since $J(A_e, B_e, C_e) = J(SA_eS^{-1}, SB_e, C_eS^{-1})$ one
would expect the Main Theorem to apply also to $(SA_eS^{-1}, SB_e, C_eS^{-1})$. This is
indeed the case since transformation of the estimator state basis corresponds
to the alternative factorization $\hat{Q}\hat{P} = (S^{-T}G)^T(SMS^{-1})(S\Gamma)$.

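Remark 2.6. Note that, for the optimal values of $A_e$, $B_e$ and $C_e$, (2.3)
assumes the observer form

$$\dot{x}_e = \Gamma A G^T x_e + \Gamma \mathcal{Q} V_2^{-1}(y - CG^T x_e). \tag{2.15}$$

By introducing the quasi-full-state estimate $\hat{x} \triangleq G^T x_e \in \mathbb{R}^n$ so that $\tau\hat{x} = \hat{x}$ and
$x_e = \Gamma\hat{x} \in \mathbb{R}^{n_e}$, (2.15) can be written as

$$\dot{\hat{x}} = \tau A \tau \hat{x} + \tau \mathcal{Q} V_2^{-1}(y - C\hat{x}). \tag{2.16}$$
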
Note that although the implemented estimator (2.15) has the state $x_e \in \mathbb{R}^{n_e}$,
(2.15) can be viewed as a quasi-full-order estimator whose geometric structure
is entirely dictated by the projection $\tau$. Specifically, error inputs $\mathcal{Q}V_2^{-1}(y - C\hat{x})$
are annihilated unless they are contained in $[\mathcal{N}(\tau)]^{\perp} = \mathcal{R}(\tau^T)$. Hence the
observation subspace of the estimator is precisely $\mathcal{R}(\tau^T)$.
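
The passage from (2.15) to (2.16) uses only $\tau = G^T\Gamma$ and $\Gamma G^T = I_{n_e}$, so it can be checked on arbitrary data. The sketch below is an assumption-laden illustration: the random matrices merely stand in for $A$, $C$, $\mathcal{Q}$, $V_2$, $G$ and $\Gamma$ (they do not solve the optimal projection equations), and the function name `observer_forms_agree` is hypothetical.

```python
import numpy as np

def observer_forms_agree(n=5, n_e=2, ell=3, seed=2):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    C = rng.standard_normal((ell, n))
    Qs = rng.standard_normal((n, ell))           # stands in for script-Q
    X = rng.standard_normal((ell, ell))
    V2 = X @ X.T + ell * np.eye(ell)             # positive-definite intensity
    G = rng.standard_normal((n_e, n))
    left_inv = np.linalg.pinv(G.T)
    # Any Gamma with Gamma @ G.T = I; the added term makes tau oblique.
    Gamma = left_inv + rng.standard_normal((n_e, n)) @ (np.eye(n) - G.T @ left_inv)
    tau = G.T @ Gamma
    gain = Qs @ np.linalg.inv(V2)

    x_e = rng.standard_normal(n_e)
    y = rng.standard_normal(ell)
    # Right-hand side of (2.15) in estimator coordinates.
    x_e_dot = Gamma @ A @ G.T @ x_e + Gamma @ gain @ (y - C @ G.T @ x_e)
    # Right-hand side of (2.16) in quasi-full-state coordinates.
    x_hat = G.T @ x_e
    x_hat_dot = tau @ A @ tau @ x_hat + tau @ gain @ (y - C @ x_hat)
    return (np.allclose(G.T @ x_e_dot, x_hat_dot),
            np.allclose(Gamma @ x_hat, x_e),
            np.allclose(tau @ x_hat, x_hat))

print(observer_forms_agree())
```

The three flags correspond to the time derivative of $\hat{x} = G^Tx_e$ matching (2.16), the recovery $x_e = \Gamma\hat{x}$, and the invariance $\tau\hat{x} = \hat{x}$.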

Remark 2.7. Although the form of (2.16) would lead one to surmise that
the optimal reduced-order estimator is a projection of the optimal full-order
estimator, this is not generally the case for the following simple reason. In
the full-order case $Q$ (which appears in $\mathcal{Q}$) is determined by solving a single
Riccati equation whereas in the reduced-order case $Q$ must be found in conjunction
with $\hat{Q}$ and $\hat{P}$ to satisfy all three matrix equations (2.10)-(2.12). Hence the
value of $Q$ in the reduced-order case may be different from the value of $Q$ in the
full-order case. Thus (2.16) may not be obtainable by simply projecting the
full-order result.

To further clarify the relationship between $\hat{Q}$, $\hat{P}$ and $\tau$, we now show that
there exists a similarity transformation which simultaneously diagonalizes $\hat{Q}\hat{P}$ and $\tau$.


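Proposition 2.1. There exists invertible $\Phi \in \mathbb{R}^{n \times n}$ such that

$$\hat{Q} = \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix}\Phi^{-T}, \qquad \hat{P} = \Phi^{T}\begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi, \tag{2.17}$$

$$\hat{Q}\hat{P} = \Phi^{-1}\begin{bmatrix} \Lambda & 0 \\ 0 & 0 \end{bmatrix}\Phi, \qquad \tau = \Phi^{-1}\begin{bmatrix} I_{n_e} & 0 \\ 0 & 0 \end{bmatrix}\Phi, \tag{2.18a,b}$$

where $\Lambda_{\hat{Q}}, \Lambda_{\hat{P}} \in \mathbb{R}^{n_e \times n_e}$ are positive diagonal, $\Lambda \triangleq \Lambda_{\hat{Q}}\Lambda_{\hat{P}}$ and the diagonal elements
of $\Lambda$ are the eigenvalues of $M$. Consequently,

$$\hat{Q} = \tau\hat{Q}, \qquad \hat{P} = \hat{P}\tau. \tag{2.19}$$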
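
The step from (2.17) and (2.18) to (2.19) is a direct block multiplication. It is written out below as a short worked check under the block forms stated in the proposition; it is not a proof of the proposition itself.

```latex
\begin{align*}
\tau\hat{Q}
  &= \Phi^{-1}\begin{bmatrix} I_{n_e} & 0 \\ 0 & 0 \end{bmatrix}\Phi\,
     \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix}\Phi^{-T}
   = \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix}\Phi^{-T}
   = \hat{Q}, \\
\hat{P}\tau
  &= \Phi^{T}\begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi\,
     \Phi^{-1}\begin{bmatrix} I_{n_e} & 0 \\ 0 & 0 \end{bmatrix}\Phi
   = \Phi^{T}\begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi
   = \hat{P}, \\
\hat{Q}\hat{P}
  &= \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}} & 0 \\ 0 & 0 \end{bmatrix}\Phi^{-T}
     \Phi^{T}\begin{bmatrix} \Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi
   = \Phi^{-1}\begin{bmatrix} \Lambda_{\hat{Q}}\Lambda_{\hat{P}} & 0 \\ 0 & 0 \end{bmatrix}\Phi .
\end{align*}
```

The last line shows that (2.17) is consistent with (2.18a) when $\Lambda = \Lambda_{\hat{Q}}\Lambda_{\hat{P}}$, and the first two lines are exactly the identities in (2.19).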


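3. Proof of the Main Theorem

The proof proceeds exactly as in [6]. Using the fact that $\mathcal{A}_+$ is
open, the Fritz John version of the Lagrange multiplier theorem can be used to
rigorously derive the first-order necessary conditions

$$0 = \tilde{A}\tilde{Q} + \tilde{Q}\tilde{A}^T + \tilde{V}, \tag{3.1}$$

$$0 = \tilde{A}^T\tilde{P} + \tilde{P}\tilde{A} + \tilde{R}, \tag{3.2}$$

$$0 = P_{12}^T Q_{12} + P_2 Q_2, \tag{3.3}$$

$$B_e = -\left[(P_2^{-1}P_{12}^T Q_1 + Q_{12}^T)C^T + P_2^{-1}P_{12}^T V_{12}\right]V_2^{-1}, \tag{3.4}$$

and the $(n+n_e) \times (n+n_e)$ matrices $\tilde{Q}$, $\tilde{P}$ are partitioned into $n \times n$, $n \times n_e$ and $n_e \times n_e$ subblocks as

$$\tilde{Q} = \begin{bmatrix} Q_1 & Q_{12} \\ Q_{12}^T & Q_2 \end{bmatrix}, \qquad \tilde{P} = \begin{bmatrix} P_1 & P_{12} \\ P_{12}^T & P_2 \end{bmatrix}.$$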


Expanding (3.1) and (3.2) yields

$$0 = AQ_1 + Q_1A^T + V_1, \tag{3.6}$$

$$0 = AQ_{12} + Q_{12}A_e^T + Q_1(B_eC)^T + V_{12}B_e^T, \tag{3.7}$$

$$0 = A_eQ_2 + Q_2A_e^T + B_eCQ_{12} + Q_{12}^T(B_eC)^T + B_eV_2B_e^T, \tag{3.8}$$

$$0 = A^TP_1 + P_1A + (B_eC)^TP_{12}^T + P_{12}B_eC + L^TRL, \tag{3.9}$$

$$0 = P_{12}A_e + A^TP_{12} + (B_eC)^TP_2 - L^TRC_e, \tag{3.10}$$

$$0 = A_e^TP_2 + P_2A_e + C_e^TRC_e. \tag{3.11}$$

Note that (3.9) is superfluous and can be omitted. Writing (3.8) as (see [13,14])

$$0 = (A_e + B_eCQ_{12}Q_2^{+})Q_2 + Q_2(A_e + B_eCQ_{12}Q_2^{+})^T + B_eV_2B_e^T,$$

where $Q_2^{+}$ is the Moore-Penrose or Drazin generalized inverse of $Q_2$, it follows
from Lemmas 2.1 and 12.2 of [11] that $Q_2$ is positive definite. Similarly,
(3.11) implies that $P_2$ is positive definite. This justifies (3.4) and (3.5).

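Now define the $n \times n$ nonnegative-definite matrices (see [13,14])

$$Q \triangleq Q_1 - Q_{12}Q_2^{-1}Q_{12}^T, \qquad \hat{Q} \triangleq Q_{12}Q_2^{-1}Q_{12}^T, \qquad \hat{P} \triangleq P_{12}P_2^{-1}P_{12}^T,$$

and note that (3.3) implies (2.5) and (2.6) with

$$G \triangleq Q_2^{-1}Q_{12}^T, \qquad M \triangleq Q_2P_2, \qquad \Gamma \triangleq -P_2^{-1}P_{12}^T.$$

Since $Q_2P_2 = P_2^{-1/2}(P_2^{1/2}Q_2P_2^{1/2})P_2^{1/2}$, $M$ is positive semisimple. Sylvester's
inequality yields (2.13). Note (2.19) and the identities

$$Q_1 = Q + \hat{Q}, \tag{3.12}$$

$$Q_{12} = \hat{Q}\Gamma^T, \qquad P_{12} = -\hat{P}G^T, \tag{3.13}$$

$$Q_2 = \Gamma\hat{Q}\Gamma^T, \qquad P_2 = G\hat{P}G^T. \tag{3.14}$$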
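
The factorization and the identities (3.13), (3.14) follow from (3.3) alone, which makes them easy to confirm numerically. The sketch below builds hypothetical blocks that satisfy (3.3) by construction (they are random and do not come from an actual optimal design) and then checks each identity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_e = 5, 2  # illustrative orders (assumed values)

def random_pd(k):
    X = rng.standard_normal((k, k))
    return X @ X.T + k * np.eye(k)

Q2, P2 = random_pd(n_e), random_pd(n_e)
Q12 = rng.standard_normal((n, n_e))
# Choose P12 so that (3.3) holds: P12^T Q12 = -P2 Q2.
P12 = -(P2 @ Q2 @ np.linalg.pinv(Q12)).T

G = np.linalg.solve(Q2, Q12.T)          # G = Q2^{-1} Q12^T
M = Q2 @ P2
Gamma = -np.linalg.solve(P2, P12.T)     # Gamma = -P2^{-1} P12^T
Q_hat = Q12 @ np.linalg.solve(Q2, Q12.T)
P_hat = P12 @ np.linalg.solve(P2, P12.T)
tau = G.T @ Gamma

checks = {
    "(3.3)": np.allclose(P12.T @ Q12 + P2 @ Q2, 0),
    "Gamma G^T = I": np.allclose(Gamma @ G.T, np.eye(n_e)),
    "Q_hat P_hat = G^T M Gamma": np.allclose(Q_hat @ P_hat, G.T @ M @ Gamma),
    "(3.13)": np.allclose(Q12, Q_hat @ Gamma.T) and np.allclose(P12, -P_hat @ G.T),
    "(3.14)": np.allclose(Q2, Gamma @ Q_hat @ Gamma.T) and np.allclose(P2, G @ P_hat @ G.T),
    "tau idempotent": np.allclose(tau @ tau, tau),
}
eig_M = np.linalg.eigvals(M)
print(checks, eig_M)
```

The eigenvalues of $M = Q_2P_2$ come out real and positive, as expected for a product of two positive-definite matrices, which is the positive-semisimple property used above.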


Using (3.12)-(3.14), (3.4) and (3.5) yield (2.8) and (2.9). Also,
RHS(3.8)-RHS(3.7) yields (2.7). Substituting (2.7)-(2.9) into (3.6)-(3.8), (3.10)
and (3.11), it can be seen that (3.8) and (3.11) are also superfluous. Finally,
linear combinations of the remaining three equations (3.6), (3.7) and (3.10) yield
(2.10)-(2.12).


4. Concluding Remarks

The  question  of  multiple  local  minima  satisfying  the  optimal  projection 
equations  for  reduced-order  state  estimation  and  the  problem  of  constructing 
numerical  methods  for  solving  these  equations  are  beyond  the  scope  of  this  note. 
It  should  be  pointed  out,  however,  that  promising  numerical  results  for  the 
model-reduction  and  fixed-order  dynamic-compensation  problems  have  been  obtained 
by  means  of  iterative  algorithms  that  take  full  advantage  of  the  presence  and 
structure  of  the  optimal  projection  ([2,4,7]). 

Finally, the results of this paper can be extended to include the following
related problems: 1) discrete-time system/discrete-time estimator; 2)
infinite-dimensional system/finite-dimensional estimator ([5]); and 3) parameter
uncertainties ([1,15]).